oracle / python-oracledb

Python driver for Oracle Database conforming to the Python DB API 2.0 specification. This is the renamed, new major release of cx_Oracle
https://oracle.github.io/python-oracledb
Other
308 stars 61 forks source link

How to bulk load to oracle table when column has missing data NaN #249

Closed Newbee25 closed 7 months ago

Newbee25 commented 8 months ago

I'm trying to bulk load a dataframe into an oracle table. The following error occurs:

_oracledb.exceptions.NotSupportedError: DPY-3013: unsupported Python type str for database type DB_TYPENUMBER

No error occurs when the records are inserted one by one.

Oracle table:

Column_Name Data_Type
Make VARCHAR2(20 BYTE)
Model VARCHAR2(20 BYTE)
Weight NUMBER
Year NUMBER(38,0)

Python Code

import pandas as pd
import numpy as np

data = {
  'Make': ['Honda', 'Toyota', 'BMW', 'Honda'],
  'Model': ['Civic', 'Camry', 'X5', 'Accord'],
  'Weight': [1020.25, np.NaN, 1235.37, 940.9],
  'Year': [2021, 2023, 2019, np.NaN]
}

df = pd.DataFrame(data)

print(df)
print(df.dtypes)

try:
    with db_connection.cursor() as cursor:

        cursor.executemany('insert into myTable values(:1, :2, :3, :4)', df.fillna('').values.tolist()) 

    db_connection.commit()

except Exception as error:
     print(error) 

Please note the floats NaN must be inserted as null values to the table.

I attempted df['Weight'].fillna(float('NaN'), inplace=True)and df['Year'].fillna(float('NaN'), inplace=True) I also attempted to replace NaN with None.

It seems it is not recognizing None and NaN as valid types.
Is issue a possible bug?

I'm using the thick client mode and it is calling init_oracle_client()

Version

python-oracledb 1.4.1 Python 3.9.16

anthony-tuininga commented 7 months ago

You can use an input type handler to transform the np.NaN values to whatever value you choose. Or you can use the method you suggested. Both should work fine. The value None is definitely recognized as a null value. The value NaN is also supported -- but only for insert into a binary_double or binary_float column. NUMBER columns do not support those values. Can you provide the full code that failed when trying with a None value? If you still see a problem I'll give it a try.

anthony-tuininga commented 7 months ago

This has been confirmed to work:

import math

def input_type_handler(cursor, value, arraysize):
    if isinstance(value, float):
        return cursor.var(oracledb.DB_TYPE_NUMBER, arraysize=arraysize,
                          inconverter=lambda x: None if math.isnan(x) else x)
db_connection.inputtypehandler = input_type_handler
cjbj commented 7 months ago

Closing - no activity.