Setting the Client Character Set

Liso77 commented 11 months ago

Hello! :) I would like to test Oracle's implicit character set conversion between server and client.

Documentation says

* - NLS_LANG
      - Determines the 'national language support' globalization options for
        python-oracledb. Note that from cx_Oracle 8, the character set component is
        ignored and only the language and territory components of ``NLS_LANG``
        are used. The character set can instead be specified during connection
        or connection pool creation. See :ref:`globalization`.

(source: https://github.com/oracle/python-oracledb/blob/9f1019a10813d98fe667283e04d6b64d05438867/doc/src/user_guide/initialization.rst#L530 )

I couldn't find any way to specify the character set. Could you please help me?

Instead I found this:

Setting the Client Character Set
In python-oracledb, the encoding used for all character data is "UTF-8". 
The encoding and nencoding parameters of the `oracledb.connect` and `oracledb.create_pool` 
methods are deprecated and ignored.

(source: https://github.com/oracle/python-oracledb/blob/9f1019a10813d98fe667283e04d6b64d05438867/doc/src/user_guide/globalization.rst#setting-the-client-character-set )

:(

cjbj commented 11 months ago

The "client" character set is not changeable. You can use another tool - or write your own!!

Liso77 commented 11 months ago

Is there a place in the source code where I could hack it in? (EDIT: I mean, is a simple fix possible?)

BTW it would be good to fix the doc in initialization.rst so that it does not create false hope ;)

anthony-tuininga commented 11 months ago

We can adjust the documentation to remove that "false hope" you mentioned. Your requirement is unusual, however! Most people use the driver to actually fetch or insert data and the encoding used for communication between database and client is not of any importance for doing that. :-)

As for modifying the code, you can easily enough remove the code to set the defaultEncoding value of the dpiContextCreateParams structure inside the function init_oracle_client(). After that, though, you also have to find the actual encoding used and then pass that through to every call to string.decode() and string.encode() that is found in the driver (the ones in the thick and base portions of the driver). There are several dozen of those! Since the thin driver never had any support for encodings other than UTF-8 I don't even know what changes would be required there. Good luck!

One final note: Oracle has a lot of character sets that don't map to IANA encodings. These will be impossible to use. Most of them are very old so hopefully you don't need to pay attention to them!
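To make the "find the actual encoding" step a little more concrete, here is a minimal sketch of what such a lookup might look like. The table and the helper `python_codec_for` are hypothetical and are not part of python-oracledb; the four pairings are only the ones commonly treated as equivalent, and Oracle's own conversion tables may differ in individual byte positions. Names without an entry return `None`, which is the "impossible to use" case.

```python
import codecs

# Hypothetical lookup table (NOT part of python-oracledb): a few Oracle
# character set names and the Python codec usually treated as equivalent.
ORACLE_TO_PYTHON = {
    "AL32UTF8": "utf-8",
    "WE8ISO8859P1": "iso8859-1",
    "EE8ISO8859P2": "iso8859-2",
    "WE8MSWIN1252": "cp1252",
}


def python_codec_for(oracle_charset):
    """Return a Python codec name for an Oracle character set, or None."""
    name = ORACLE_TO_PYTHON.get(oracle_charset.upper())
    if name is None:
        # No known counterpart, so str.encode()/bytes.decode() cannot be used.
        return None
    # codecs.lookup() raises LookupError if Python does not ship the codec.
    return codecs.lookup(name).name


codec = python_codec_for("EE8ISO8859P2")
encoded = "ő".encode(codec)  # U+0151, present in ISO 8859-2 but not in 8859-1
```

Every `encode()` and `decode()` call in the driver would then need to receive that codec name instead of assuming UTF-8.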

Liso77 commented 11 months ago

Thank you both! :)

Let me try to describe my intention with the following "example":

Find a VALUE such that running:

select utl_raw.cast_to_varchar2(hextoraw(VALUE)) from dual

against an AL32UTF8 database server, from a client using the EE8ISO8859P2 character set, returns bytes(range(256)) after Oracle's implicit conversion.
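A rough way to explore this offline is to build a candidate VALUE in plain Python. This is only a sketch: it assumes Python's `iso8859-2` codec is an acceptable stand-in for Oracle's EE8ISO8859P2 mapping, which may not hold for every byte (the control ranges in particular), and it says nothing about what Oracle's implicit conversion really does on the wire. The variable names are illustrative.

```python
# All 256 byte values that the client should end up receiving.
target = bytes(range(256))

# Read those bytes as ISO 8859-2 text, then encode that text as UTF-8,
# which is what an AL32UTF8 database would be expected to hold.
text = target.decode("iso8859-2")
utf8_bytes = text.encode("utf-8")

# Hex string to try as VALUE in hextoraw(VALUE).
value = utf8_bytes.hex().upper()

# Simulate the server-to-client conversion with Python's codecs only.
round_trip = bytes.fromhex(value).decode("utf-8").encode("iso8859-2")

print(len(value), round_trip == target)
```

If the round trip succeeds here but fails against a real database, the difference would point at positions where Oracle's conversion table and Python's codec disagree.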

My hypothesis is that it is unsolvable, but I would like to try it, and oracledb seems (or seemed?) to be a fantastic tool for these tests. :)

In the real world, my problem is testing whether a C++ binary client can be configured the same way (without rewriting and recompiling it) after the DB server is converted to UTF-8.

I would prefer oracledb to be better suited to this kind of research, but I understand your intention to keep the code simpler. So it is time to close this issue. :)

oracle / python-oracledb

Setting the Client Character Set #210