exasol / script-languages

Base Repository for the Script Language Container for user defined functions (UDF's) that can be used in the EXASOL database. You can find the release repository under https://github.com/exasol/script-languages-release
https://docs.exasol.com/database_concepts/udf_scripts.htm
MIT License

#902: fixed memory related bugs with emit dataframe #414

Closed tomuben closed 4 months ago

tomuben commented 5 months ago

fixes #902

Three problems were identified; the first two are memory related:

1. Numpy object leaked

The Py-Object returned from

PyArray_FROM_OTF(data.get(), NPY_OBJECT, NPY_ARRAY_IN_ARRAY)

also needs to be deallocated (call to Py_XDECREF()). In the current implementation, we decreased the reference counter only for the transposed array. Debugging showed the following reference counts:

Ref count of colArray = 1
Ref count of pyArray = 2

This means the array returned by PyArray_Transpose() is a new object.

=> We need to decrease the reference counter for both.
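The relationship between the two arrays can be reproduced from Python. This is a minimal sketch, not the container's code: `np.asarray` and `np.transpose` are used as Python-level stand-ins for `PyArray_FROM_OTF` and `PyArray_Transpose`, and the sample rows are made up. It shows that the transposed array is a separate object which keeps the original alive through its `base` attribute, which would be consistent with the reference counts above.

```python
import numpy as np

# Hypothetical sample rows; the real data comes from the emitted dataframe.
rows = [[1, "a"], [2, "b"], [3, "c"]]

# Python-level stand-in for PyArray_FROM_OTF(..., NPY_OBJECT, ...).
py_array = np.asarray(rows, dtype=object)

# Python-level stand-in for PyArray_Transpose().
col_array = np.transpose(py_array)

# The transpose is a distinct object (a view), not the same array.
print(col_array is py_array)            # False
# The view holds its own reference to the original array.
print(col_array.base is py_array)       # True
print(py_array.shape, col_array.shape)  # (3, 2) (2, 3)
```

Because there are two objects, releasing only the transposed one leaves the original array allocated.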

2. Items returned from PyList_GetItem() must not be released

See the Python C API documentation:

...
Return value: Borrowed reference. Part of the [Stable ABI](https://docs.python.org/3/c-api/stable.html#stable)
...
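The borrowed-reference behaviour can be observed without compiling any C code. The sketch below is only an illustration and assumes CPython: it calls `PyList_GetItem` through `ctypes.pythonapi` and compares the item's reference count before and after the call. The names `get_item`, `item`, and `items` are made up for the example.

```python
import ctypes
import sys

# Call the C API function directly through ctypes.
get_item = ctypes.pythonapi.PyList_GetItem
get_item.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
# Return the raw pointer so ctypes does not touch the reference count.
get_item.restype = ctypes.c_void_p

item = object()
items = [item]

before = sys.getrefcount(item)
address = get_item(items, 0)
after = sys.getrefcount(item)

print(address == id(item))  # True on CPython: the pointer is the same object
print(before == after)      # True: the call did not create a new reference
```

Since the call adds no reference, a Py_DECREF on the returned pointer would drop a reference that the list still owns.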

3. emit with a datetime-only dataframe fails

Running emit on a dataframe that contains only datetime64[ns] columns fails with the error message:

pyodbc.DataError: ('22002', '[22002] [EXASOL][EXASolution driver]VM error: F-UDF-CL-LIB-1127: F-UDF-CL-SL-PYTHON-1002: F-UDF-CL-SL-PYTHON-1026: ExaUDFError: F-UDF-CL-SL-PYTHON-1114: Exception during run \nTEST_DTYPE_EMIT:7 run\nRuntimeError: F-UDF-CL-SL-PYTHON-1136: F-UDF-CL-SL-PYTHON-1130: PyObject is unexpectedly a null pointer\n (Session: 1800240827916484608) (-3452546) (SQLExecDirectW)')

The reason is that the default conversion to numpy expects only objects as cell items. For the case where the source dataframe contains only a single column of type NPY_DATETIME, a workaround was already implemented (see here). Solution: convert all items in the dataframe to type object if all columns are of type NPY_DATETIME.
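The dtype difference behind this failure can be seen on the pandas side. The following sketch rests on an assumption: that the container obtains the numpy array through something equivalent to `DataFrame.to_numpy()`. The frames and column names are hypothetical. It shows that a datetime-only frame yields a datetime array instead of an object array, and that converting with `astype(object)` first restores object cells.

```python
import pandas as pd

# Hypothetical data; force nanosecond resolution to match the report.
stamps = pd.to_datetime(["2020-01-01", "2020-01-02"]).astype("datetime64[ns]")

only_dates = pd.DataFrame({"a": stamps, "b": stamps})
mixed = pd.DataFrame({"a": stamps, "n": [1, 2]})

# A mixed frame falls back to an object array, which the emit code expects.
print(mixed.to_numpy().dtype)            # object

# A datetime-only frame keeps a datetime dtype (kind "M"), not object.
print(only_dates.to_numpy().dtype.kind)  # M

# Converting all columns to object first yields an object array again.
converted = only_dates.astype(object).to_numpy()
print(converted.dtype)                   # object
```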

Minor changes