oracle / python-cx_Oracle

Python interface to Oracle Database now superseded by python-oracledb
https://oracle.github.io/python-cx_Oracle
Other
890 stars 361 forks source link

Random character is displayed when unicode data is fetched. #157

Closed sachin-aryal closed 6 years ago

sachin-aryal commented 6 years ago
  1. What is your version of Python? Is it 32-bit or 64-bit?

Python 2.7 and 64-bit

  1. What is your version of cx_Oracle?

6.0.2

  1. What is your version of the Oracle client (e.g. Instant Client)? How was it installed? Where is it installed?

11.2.0.4.0 Unzipped and oracle client files and set the environment variables. /Applications/oracle/product/instantclient_64/11.2.0.4.0/

  1. What is your version of the Oracle Database? Oracle is Installed on Docker

Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for Linux: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production

  1. What is your OS and version?

os x el capitan version 10.11

  1. What compiler version did you use? For example, with GCC, run gcc --version.

Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1 Apple LLVM version 8.0.0 (clang-800.0.42.1) Target: x86_64-apple-darwin15.0.0 Thread model: posix InstalledDir: /Library/Developer/CommandLineTools/usr/bin

  1. What environment variables did you set? How exactly did you set them?

export ORACLE_HOME=/Applications/oracle/product/instantclient_64/11.2.0.4.0 ORACLE_SID=xe export ORACLE_SID export PATH=$ORACLE_HOME/bin:$PATH export DYLD_LIBRARY_PATH=$ORACLE_HOME/lib

I am doing something like this in my code

import cx_Oracle
connection_string = "%s:%s/%s" % ("192.168.0.100", "1521", "xe")
connection = cx_Oracle.connect("system", "oracle", connection_string, encoding="UTF-8", nencoding="UTF-8")
cur = connection.cursor()
print ("Connection Version: {}".format(connection.version))
print(connection.encoding)
print(connection.nencoding)
query = "select *from product_information"
cur.execute(query)
records = cur.fetchmany(1000)
for r in records:
    print r
    print (r[2])

I got the output like this

UTF-8 UTF-8 (3, u'testing', 'test-\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd') test-������������

I am using the following query to create a table in Oracle database

CREATE TABLE product_information 
    ( product_id          NUMBER(6) 
    , product_name        NVARCHAR2(100) 
    , product_description VARCHAR2(1000));

I used the following query to insert data

insert into product_information values(2, 'teting', 'test-दुःख');

Query: SELECT * from NLS_DATABASE_PARAMETERS WHERE parameter IN ( 'NLS_LANGUAGE', 'NLS_TERRITORY', 'NLS_CHARACTERSET');

Result

NLS_LANGUAGE: AMERICAN, NLS_TERRITORY: AMERICA, NLS_CHARACTERSET: AL32UTF8

anthony-tuininga commented 6 years ago

That data appears to be messed up. Did you insert that string using Python or SQL*Plus? If Python, can you post the script you used? If SQL*Plus, did you make sure you set NLS_LANG=.AL32UTF8 before running SQL*Plus?

If I insert that data, ensuring that I inserted it using the UTF-8 encoding, I get the following result when I query it using your code:

(2, u'teting', 'test-\xe0\xa4\xa6\xe0\xa5\x81\xe0\xa4\x83\xe0\xa4\x96')

That is expected. If I take that result and decode it (product_description.decode("UTF-8")) then I get this result:

(2, 'teting', 'test-दुःख')

That also happens to be the same result that I get if I use Python 3, which make the whole unicode thing much simpler to manage! :-)

sachin-aryal commented 6 years ago

I insert the data with SQL*PLUS. Where should I set NLS_LANG=.AL32UTF8 ? When I do this

query = "select *from product_information"
cur.execute(query)
records = cur.fetchmany(1000)
for r in records:
    print(r)
    print (r[2].decode("UTF-8"))

I got following output

2, 'teting', 'test-������������') Traceback (most recent call last): File "test.py", line 24, in print (r[2].decode("UTF-8")) AttributeError: 'str' object has no attribute 'decode'

anthony-tuininga commented 6 years ago

You need to set the environment variable NLS_LANG to the value ".AL32UTF8" prior to executing SQL*Plus. You can usually do this in the following fashion:

NLS_LANG=.AL32UTF8 sqlplus

If you use Python 3.x you will get the error you noted when you attempt to decode using a string. Using Python 2.x you will not get that error (bytes and str are the same object).

sachin-aryal commented 6 years ago

When I set the NLS_LANG=.AL32UTF8 as the environment variable. It worked perfectly thank you. Do I need to set the environment variable in the system where Oracle is installed or I can set it with python?

anthony-tuininga commented 6 years ago

The environment variable can be set, but if you use the parameters encoding = "UTF-8" and nencoding = "UTF-8" when you create your connection in Python that is doing the equivalent thing and you don't need the environment variable. When you use SQL*Plus, though, you always need the environment variable.

sachin-aryal commented 6 years ago

Okay sure. It solved my problem.

anthony-tuininga commented 6 years ago

Excellent.

arvindsam96 commented 4 years ago

while retrieving data from table french data is converted to unknown variables From Table: LE VILLAGE DU P? NO?(APP)-PIP PAKBASIC-AMQUI

Actual data: LE VILLAGE DU PÈRE NOËL (APP) -PIP PAKBASIC-AMQUI

i have tried setting NLS_LANG also import os os.environ["NLS_LANG"] = 'AMERICAN_AMERICA.WE8ISO8859P1' import cx_Oracle

but nothing seems to be working for me.

my NLS_DATABASE_PARAMETERS are [('NLS_RDBMS_VERSION', '12.1.0.2.0'), ('NLS_NCHAR_CONV_EXCP', 'FALSE'), ('NLS_LENGTH_SEMANTICS', 'CHAR'), ('NLS_COMP', 'BINARY'), ('NLS_DUAL_CURRENCY', '$'), ('NLS_TIMESTAMP_TZ_FORMAT', 'DD-MON-RR HH.MI.SSXFF AM TZR'), ('NLS_TIME_TZ_FORMAT', 'HH.MI.SSXFF AM TZR'), ('NLS_TIMESTAMP_FORMAT', 'DD-MON-RR HH.MI.SSXFF AM'), ('NLS_TIME_FORMAT', 'HH.MI.SSXFF AM'), ('NLS_SORT', 'BINARY'), ('NLS_DATE_LANGUAGE', 'AMERICAN'), ('NLS_DATE_FORMAT', 'DD-MON-RR'), ('NLS_CALENDAR', 'GREGORIAN'), ('NLS_NUMERIC_CHARACTERS', '.,'), ('NLS_NCHAR_CHARACTERSET', 'AL16UTF16'), ('NLS_CHARACTERSET', 'AL32UTF8'), ('NLS_ISO_CURRENCY', 'AMERICA'), ('NLS_CURRENCY', '$'), ('NLS_TERRITORY', 'AMERICA'), ('NLS_LANGUAGE', 'AMERICAN')]

anthony-tuininga commented 4 years ago

Use a new issue instead of tacking your question on to an old closed issue, please!

To briefly answer your question: you should be creating your connection as follows:

connection = cx_Oracle.connect("user/pwd@dsn", encoding="UTF-8", nencoding="UTF8")

You can see more detailed instructions in the documentation.

arvindsam96 commented 4 years ago

Thanks for the reply. I have tried this also. No changes in table retrieved data (invalid characters)

I have raised a new issue also. https://github.com/oracle/python-cx_Oracle/issues/400

arvindsam96 commented 4 years ago

Python 3.7.1 cx_Oracle 7.2.0 Oracle databse 12.1.0.2.0

i am using curosr.execute(query) to fetch data

arvindsam96 commented 4 years ago

----------------------------------------------------------------Modules

import os os.environ["NLS_LANG"] = 'AMERICAN_AMERICA.WE8ISO8859P1' import cx_Oracle import logger

-----------------------------------------------------------------------

Create logger instance log = logger.getlogger()

-----------------------------------------------------------------------

def oraclesql(datasource, username, password): dsn_tns = '' conn = '' cursor = '' try: dsn_conn = datasource.split('/')

get host and port

host_port = dsn_conn[0].split(':') log.info('connecting to hostname: {0} port: {1} service: {2}' .format(str(host_port[0]), host_port[1], str(dsn_conn[1])))

create dsn

dsn_tns = cx_Oracle.makedsn(str(host_port[0]), host_port[1], service_name=str(dsn_conn[1])) conn = cx_Oracle.connect(user=str(username), password=str(password), dsn=dsn_tns) cursor = conn.cursor() log.info('connection encoding is: {}/{}'.format(conn.encoding, conn.nencoding)) log.info('connected to DB version: {}'.format(conn.version)) except Exception as oracle_error: content = ('oracle '+dsn_tns+' is unreachable, Error: '+ str(oracle_error)).strip() log.error(content)

cjbj commented 4 years ago

@arvindsam96 Use a new issue.