Closed PreHCM-diba closed 8 months ago
hi @gvenzl, first of all thanks for the great work!
i am also struggling with this: i want to utilize gvenzl/oracle-free:23.3-slim-faststart
in testcontainers
with a baseline schema/data from a WE8ISO8859P9 DB which has varchar2(50) columns storing chars like Ş, Ü, İ with length of 48 (with help of these 2-byte chars, total byte length for the column is around 51/52 on some rows in AL32UTF8).
i tried to alter length semantics but could not find a way to change it on DB level (i managed to set it on instance or session but my insert statements are still failing with "...too small..." issue).
i understand the value of AL32UTF8 but i can't change the prod DB ¯\(ツ)/¯, being able to adjust the test setup in-line with prod would be great.
and to be clear: i am not expert on oracle, i assumed my problem is related with this but i could probably got it all wrong. so, would love to hear your feedback on it.
Hi @PreHCM-diba, hi @akurone,
Thanks to the both of you for using these images!
@PreHCM-diba,
The NLS_LANG
environment variable is only defaulted to .AL32UTF8
. This can always be overwritten via the -e
option when running the container:
[gvenzl@localhost ~]$ podman run -d --name test \
> -e ORACLE_PASSWORD=LetsTest1 \
> -e NLS_LANG=.WE8ISO8859P1 \
> gvenzl/oracle-free:slim-faststart
[gvenzl@localhost ~]$ podman exec -ti test bash
bash-4.4$ echo $NLS_LANG
.WE8ISO8859P1
@akurone,
The database character set, unfortunately, is a different beast. It can only be defined during database creation and not easily changed afterward. Hence, the only logical path was to pick the since 11gR2 (2007) recommended default of AL32UTF8 that will be able to store any character in today's existence.
The only viable path here would be to build your own, customized docker image, or provide the image with a volume that contains such a customized Oracle Free database.
Fortunately, I think, there is another possibility for you here that hopefully does the trick.
Oracle Database allows to specify the character length semantics to be either in bytes (default) or characters.
The default can be controlleded via the NLS_LENGTH_SEMANTICS parameter.
If you were to set this parameter to the value CHAR
, the table containing the varchar2(50) column will have enough space to store 50 UTF-8 characters, instead of 50 bytes. Logically, nothing should change for your app, functions like LENGTH()
, etc. will still report a length of one character, not the byte, for example:
The insert below fails because the Ş
character requires 2 bytes rather than one:
SQL> create table foo (t varchar2(1));
Table created.
SQL> insert into foo values ('Ş');
insert into foo values ('Ş')
*
ERROR at line 1:
ORA-12899: value too large for column "SYSTEM"."FOO"."T" (actual: 2, maximum:
1)
Help: https://docs.oracle.com/error-help/db/ora-12899/
But when creating the table with a length of 1 CHAR
, the character now fits:
SQL> create table foo1 (t varchar2(1 char));
Table created.
SQL> insert into foo1 values ('Ş');
1 row created.
When you do a describe on the table, you will see the different length semantics in the output:
SQL> desc foo
Name Null? Type
----------------------------------------- -------- ----------------------------
T VARCHAR2(1)
SQL> desc foo1
Name Null? Type
----------------------------------------- -------- ----------------------------
T VARCHAR2(1 CHAR)
And when you do a select length()
over that one character, Oracle will still return 1:
SQL> select length(t) from foo1;
LENGTH(T)
----------
1
Instead of changing your DDL scripts to type CHAR
everywhere, you can just set NLS_LENGTH_SEMANTICS=CHAR
:
SQL> alter session set nls_length_semantics=char;
Session altered.
SQL> create table foo2 (t varchar2(1));
Table created.
SQL> desc foo2
Name Null? Type
----------------------------------------- -------- ----------------------------
T VARCHAR2(1)
SQL> insert into foo2 values ('Ş');
1 row created.
SQL> select length(t) from foo2;
LENGTH(T)
----------
1
SQL> desc foo;
Name Null? Type
----------------------------------------- -------- ----------------------------
T VARCHAR2(1 BYTE)
I hope this helps!
thanks @gvenzl, both for sharing these images and for the detailed response!
i am currently using the NLS_LENGTH_SEMANTICS
trick; meanwhile we are exploring official dockerfiles (also made by you i guess:)) with my dba colleagues to build a custom image.
Hey @akurone,
You are very welcome!
Ha, yes, once upon a time but now they are in the hands of my capable colleagues. :)
The official EE
/SE2
images do have an environment variable for the character set -e ORACLE_CHARACTERSET=
as they are creating the database at first startup.
You should be able to have a custom Dockerfile that creates a database with the character set you are looking for and then commits a new image for you.
Would be nice to have a possibility to set the NLS_LANG parameter, to also have the possibility to import data that was exported from another database, correctly. Curently it's fixes to AL32UTF8, butwould be nice to make it configurable to set it also to e.g. WE8ISO8859P1