oracle / python-oracledb

Python driver for Oracle Database conforming to the Python DB API 2.0 specification. This is the renamed, new major release of cx_Oracle
https://oracle.github.io/python-oracledb

ERROR: 'utf-8' codec can't decode byte 0x85 in position 0: invalid start byte #336

Closed anthony-tuininga closed 1 month ago

anthony-tuininga commented 1 month ago

Discussed in https://github.com/oracle/python-oracledb/discussions/335

Originally posted by **bnvader** May 16, 2024

I am trying to use python-oracledb to connect to an Oracle Database that uses the ISO-8859-1 encoding. I have a custom database type of type DB Table. When I use thick client mode, I am able to directly use DBObject.Attribute_Name to retrieve the attribute values from a row. However, this same step fails if I do not use thick mode. I would like to have this working in thin mode, as I eventually want to use this from a simple AWS Lambda function. Has anyone been able to use thin mode to successfully decode data from a non-UTF-8 encoded Oracle DB? If so, what is the secret?

anthony-tuininga commented 1 month ago

I am able to confirm that the issue is due to the presence of an xmltype attribute within a database object and will see what is needed to address this issue.
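For readers unfamiliar with the message in the issue title, here is a minimal sketch of what it means at the Python level. It uses only the built-in codecs and says nothing about how python-oracledb decodes object attributes internally; the single byte used here is taken from the error message, not from the reporter's data.

```python
# Illustration only: plain Python codecs, not python-oracledb internals.
raw = b"\x85"  # a byte that is valid in ISO-8859-1 but not on its own in UTF-8

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0x85 is a UTF-8 continuation byte, so it cannot start a character.
    message = str(exc)
    print(message)

# Decoding with the character set the data was written in succeeds.
text = raw.decode("iso-8859-1")
print(repr(text))  # '\x85' (U+0085, the NEXT LINE control character)
```

The same byte decodes cleanly once the right character set is used, which is why the symptom points at the wrong decoder being applied rather than at corrupt data.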

anthony-tuininga commented 1 month ago

Good news! I was able to correct the issue. If you are able to build from source you can verify that it works for you, too.

bnvader commented 1 month ago

Great. I have not been building or installing from source. I am using Anaconda/Spyder IDE for this development and not sure how to build from source using that. But let me check.

bnvader commented 1 month ago

I was able to build from source and tried to replace the oracledb folder content with the new local oracledb folder content but must be missing something. Sorry, I'm new to python and may not fully understand the module loading part here.

It fails with this error:

File ~\Documents\software\python\batch_extracts\scripts./txgov_batch_audit_db.py:12
    import oracledb
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\__init__.py:43
    from . import base_impl, thick_impl, thin_impl
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\base_impl.py:9
    __bootstrap__()
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\base_impl.py:7 in __bootstrap__
    mod = importlib.util.module_from_spec(spec)
ImportError: DLL load failed while importing base_impl: The specified module could not be found.

anthony-tuininga commented 1 month ago

I don't know what games Anaconda plays with modules. In a regular installation, however, there should not be a base_impl.py but only a base_impl.pyd! It might be better to rename the original directory and copy the new content from the build directory to the same named directory. It is possible that not all of the files were replaced and are wreaking havoc! Another option that I use is to simply set the environment variable PYTHONPATH to point to the build directory. You can also run python setup.py build install and that should also work.
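When several copies of a package may be present (an installed one, a copied folder, a build directory on PYTHONPATH), it can help to check which copy Python would actually import. The following is a minimal sketch using only the standard library; the helper name `module_origin` is made up for this illustration.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file Python would load for a top-level module, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # For a package this is the path of its __init__.py file.
    print("oracledb would be loaded from:", module_origin("oracledb"))
    # Directories are searched in this order; PYTHONPATH entries come early.
    for entry in sys.path:
        print("  search path:", entry)
```

If the printed path is not inside the directory you expect, the import is picking up a different copy than the one you just built.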

bnvader commented 1 month ago

I did just copy the oracledb folder in its entirety to the location where Anaconda keeps the site-packages. But it is failing to load. I guess I will just wait for this to be available where I can install it using pip install from Anaconda as before.

anthony-tuininga commented 1 month ago

It might be easier for you to do this, then, in the source directory:

python -m build

This will create a wheel in the dist subdirectory. You can then install that wheel in the same way you install other wheels.

If the build module is missing you can do this:

python -m pip install build

Let me know if that works better for you. We can update the build from source instructions to give that option.
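The two steps above (build a wheel, then install it) can be scripted. This is a rough sketch, assuming it is run against a clone of the source tree with the build module already installed; the helper names are invented for illustration, and `--force-reinstall` is an optional pip flag added here so that an already installed copy is replaced.

```python
import subprocess
import sys
from pathlib import Path


def build_command():
    """Command that runs the build module with the current interpreter."""
    return [sys.executable, "-m", "build"]


def newest_wheel(dist_dir):
    """Pick the most recently modified wheel in the dist directory."""
    wheels = sorted(Path(dist_dir).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    if not wheels:
        raise FileNotFoundError(f"no wheel found in {dist_dir}")
    return wheels[-1]


def install_command(wheel):
    """Command that installs one wheel file with pip."""
    return [sys.executable, "-m", "pip", "install", "--force-reinstall", str(wheel)]


def build_and_install(source_dir):
    """Build in source_dir, then install the wheel that was produced."""
    source_dir = Path(source_dir)
    subprocess.run(build_command(), cwd=source_dir, check=True)
    subprocess.run(install_command(newest_wheel(source_dir / "dist")), check=True)
```

Calling `build_and_install(".")` from the root of the clone would perform both steps; nothing runs until that function is called.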

bnvader commented 1 month ago

Good news! I was able to install the package directly to conda from git, and I was able to validate the fix! Thanks much for fixing this so quickly.

bnvader commented 1 month ago

When will this fix be available as a release?

anthony-tuininga commented 1 month ago

We will discuss internally whether it makes sense to create a patch release in the next week or two; otherwise, it would likely be in a couple of months. Do you need this sooner rather than later? :-)

bnvader commented 1 month ago

A patch release would be great if possible, as it would probably ease our deployment process to AWS. Meanwhile, could you please point me to the steps to install oracledb from git source to a user-defined directory path on Linux? It looks like python -m pip install build will install to the Python lib path, which may not be accessible to the user. Are the contents of the build/lib folder, after running the build command, sufficient to import and run the library? Thanks.

bnvader commented 1 month ago

Update: We are facing an error trying to install the beta version from git source on AWS EC2 CloudShell.

building 'oracledb.base_impl' extension
gcc -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.9 -c src/oracledb/base_impl.c -o build/temp.linux-x86_64-cpython-39/src/oracledb/base_impl.o
src/oracledb/base_impl.c:49:10: fatal error: Python.h: No such file or directory
   49 | #include "Python.h"
      |          ^~~~~~
compilation terminated.

Is there any possibility you could provide a Linux zip dist here for us to test it on Linux?

anthony-tuininga commented 1 month ago

> 49 | #include "Python.h" | ^~~~~~ compilation terminated.

That error implies that you don't have the Python development package installed. You should be able to install that package fairly easily.
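One quick way to confirm whether the headers are present is to ask the interpreter where it expects them. This is a minimal sketch using the standard `sysconfig` module; the helper names are made up for illustration. The development package is typically called python3-devel on RPM-based distributions and python3-dev on Debian-based ones, though the exact name depends on the distribution and Python version.

```python
import sysconfig
from pathlib import Path


def python_header_path():
    """Location where this interpreter expects to find Python.h."""
    return Path(sysconfig.get_paths()["include"]) / "Python.h"


def headers_available():
    """True if Python.h exists, i.e. C extensions can be compiled."""
    return python_header_path().is_file()


if __name__ == "__main__":
    print("expected header:", python_header_path())
    print("development headers installed:", headers_available())
```

If the second line prints False, installing the development package for the same Python version that runs the build should resolve the compiler error.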

> A patch release would be great if possible, as it would probably ease our deployment process to AWS.

We will plan for a patch release. Exact timing is yet to be determined but it should be sometime in the next week or two.

> Meanwhile, could you please point me to the steps to install oracledb from git source to a user-defined directory path on Linux?

My suggestion is to use the build module to create a wheel and then install the wheel the same way you would any other wheel. You can run setup.py build install as well. You can force the package to write to a user directory with the --user option but recent versions of Python already check for writability of the target directory and automatically invoke that option if needed.
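For a directory of your own choosing, rather than the per-user site directory that --user selects, pip also has a `--target` option. The sketch below is an assumption about how that could be combined with the wheel approach, not something suggested in this thread; the helper names and the example paths are hypothetical.

```python
import subprocess
import sys


def target_install_command(wheel, target_dir):
    """pip command that unpacks a wheel into an arbitrary directory."""
    return [sys.executable, "-m", "pip", "install",
            "--target", str(target_dir), str(wheel)]


def enable_imports_from(target_dir):
    """Make packages in target_dir importable in the current process."""
    target_dir = str(target_dir)
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)


def install_to(wheel, target_dir):
    """Install the wheel into target_dir, then make it importable."""
    subprocess.run(target_install_command(wheel, target_dir), check=True)
    enable_imports_from(target_dir)
```

For example, `install_to("dist/some.whl", "/home/me/pylibs")` followed by `import oracledb` (both paths are placeholders). Setting PYTHONPATH to the target directory has the same effect for new processes. The wheel still has to match the platform and Python version it is installed on.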

> Is there any possibility you could provide a Linux zip dist here for us to test it on Linux?

For Linux a manylinux wheel is going to be the most likely option to succeed. A straight zip file may or may not work depending on the different versions of Linux that we may be running! Since you have been able to compile already and verify that it works for you, creating a wheel shouldn't be too much trouble using the build module. That's going to be the best option I think.

bnvader commented 1 month ago

I have a Windows laptop and hence have tested everything on Windows. The Lambda environment is RHEL Linux and unfortunately I do not have access to it. Will I be able to build a manylinux wheel for Linux from my local Windows install?

cjbj commented 1 month ago

Until we release an update on PyPI, options to get Linux include using VirtualBox, or a free 'Oracle Compute Instance' on https://www.oracle.com/cloud/free/

bnvader commented 1 month ago

We are struggling a bit getting this build from source working in the AWS Lambda environment where we have to eventually run this. The pip install option had worked fine previously but creating the site-package from git source has been a challenge for the devs.

At the moment it fails to import oracledb with the following error:

"errorMessage": "/lib64/libc.so.6: version `GLIBC_2.34' not found (required by /opt/python/lib/python3.11/site-packages/oracledb/thick_impl.cpython-311-x86_64-linux-gnu.so)", "errorType": "ImportError",

Any familiarity with this? Not sure if the issue is with the AWS Lambda shell or the way we are packaging the library.
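The message above says that the compiled extension needs symbols from glibc 2.34 or later, while the C library on the machine doing the import is older. A small sketch for checking what a given environment provides, using the standard `platform` module; the helper names are invented for illustration.

```python
import platform
import re


def parse_version(text):
    """Turn '2.34' into (2, 34) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def runtime_glibc():
    """Return the glibc version seen by this interpreter, or None."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None  # not Linux, or not a glibc-based system
    return parse_version(version)


def satisfies(required, available):
    """True when the available glibc is at least the required version."""
    return available is not None and available >= parse_version(required)


if __name__ == "__main__":
    found = runtime_glibc()
    print("glibc found:", found)
    print("can load a GLIBC_2.34 extension:", satisfies("2.34", found))
```

Running this on both the build machine and the deployment target shows whether the build machine is newer than the target, which is the situation that produces this import error.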

anthony-tuininga commented 1 month ago

Generally you have to create a manylinux wheel or you have to make sure you build your wheel on an older platform than the one you are trying to distribute to! Building a manylinux wheel is fairly straightforward. I don't know what AWS uses, but assuming it uses the x86_64 platform, this is what I use for building wheels:

#! /usr/bin/bash
# Produces "manylinux" wheels using a container image built on CentOS 7. For
# additional information, see https://github.com/pypa/auditwheel.
#
#   Currently using manylinux2014 (based on CentOS 7):
#       podman pull quay.io/pypa/manylinux2014_x86_64
#
# This script should be run in the root directory of a clone of python-oracledb
# with a source distribution package already created and stored within the
# "dist" subdirectory. Once all of the wheels have been built they will be
# placed within the "dist" subdirectory as well.

# ensure that the dist subdirectory exists
if [ ! -d "dist" ]; then
    mkdir dist
fi

# generate script for building
SCRIPT_NAME=dist/linux_build_on_container.sh
cat > $SCRIPT_NAME << EOF
#! /bin/bash

cd /io

# build module for all supported Python versions
/opt/python/cp37-cp37m/bin/python3.7 -m build
/opt/python/cp38-cp38/bin/python3.8 -m build
/opt/python/cp39-cp39/bin/python3.9 -m build
/opt/python/cp310-cp310/bin/python3.10 -m build
/opt/python/cp311-cp311/bin/python3.11 -m build
/opt/python/cp312-cp312/bin/python3.12 -m build

# turn the base wheels into "manylinux" wheels
cd dist
auditwheel repair *.whl
rm -f oracledb-*x86_64.whl
mv -i wheelhouse/* .
rm -rf wheelhouse

exit

EOF
chmod +x $SCRIPT_NAME

# run script
sudo podman run -i -t -v `pwd`:/io quay.io/pypa/manylinux2014_x86_64 \
        /bin/bash -c /io/$SCRIPT_NAME
rm $SCRIPT_NAME
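After a run of the script above, the dist subdirectory should contain only repaired wheels. As a quick sanity check, the platform tag can be read from each wheel's file name. This is a sketch with made-up helper names; the file names in the usage note are hypothetical examples of the naming pattern, not actual release artifacts.

```python
from pathlib import Path


def platform_tag(wheel_name):
    """Return the platform tag, the last dash-separated field of a wheel name."""
    stem = wheel_name[:-len(".whl")] if wheel_name.endswith(".whl") else wheel_name
    return stem.split("-")[-1]


def is_manylinux(wheel_name):
    """True if every dot-separated platform tag is a manylinux tag."""
    return all(tag.startswith("manylinux")
               for tag in platform_tag(wheel_name).split("."))


def non_manylinux_wheels(dist_dir):
    """List wheels in dist_dir that were not repaired into manylinux wheels."""
    return [p.name for p in sorted(Path(dist_dir).glob("*.whl"))
            if not is_manylinux(p.name)]
```

A name such as `oracledb-2.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl` passes the check, while `oracledb-2.2.1-cp311-cp311-linux_x86_64.whl` does not. The manylinux2014 tag corresponds to glibc 2.17, so such a wheel should load on any system with that version or newer.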

A few notes to help you if you want to use this approach:

Hope that helps! We have discussed creating a patch release and have tentatively scheduled that for early next week. Once it is out I will post here again.

anthony-tuininga commented 1 month ago

This was included in version 2.2.1 which was just released.