pacman82 / arrow-odbc-py

Read Apache Arrow batches from ODBC data sources in Python
MIT License

Oracle: Timestamp not extracted correctly #107

Closed jayctran closed 1 month ago

jayctran commented 1 month ago

Thanks for all your efforts on maintaining this library.

I've pulled out the relevant portions of the logs:

[screenshot: relevant log output]

[screenshot: relevant log output]

First record, arrow-odbc vs. pyodbc:

[screenshot: first-record comparison]

[screenshot: first-record comparison]

pyodbc did truncate to 6 digits, but microsecond precision is acceptable for our use case.

pacman82 commented 1 month ago

Hello @jayctran ,

thanks for providing the output. This is enough information to figure out what is happening. arrow-odbc uses the precision reported by the driver to decide whether the timestamp should be treated as having nanosecond, microsecond, or millisecond precision, or no fractional seconds at all.

You can find the logic here: https://github.com/pacman82/arrow-odbc/blob/2b4e2a33a8c7615feb75045e6831d549b18fbce2/src/schema.rs#L86

I do not know what pyodbc does, but I assume it just hard-codes microsecond precision.

As you can see from the printed debug output, the Oracle ODBC driver reports a precision of 0. Given the relational type you specified, it should not be 0. I would argue this is a bug in the Oracle driver. In case you would like to report this bug upstream, the fault is likely within the driver's implementation of SQLDescribeCol.

Luckily for you, arrow-odbc offers a workaround. You can either manually specify an Arrow schema (using the aptly named schema parameter), or, in case you need this to work with an unknown schema, supply a callable via the map_schema parameter. In this callable you could map every field type to itself, except for timestamps of seconds precision, which you map to microseconds. You could even get better precision than pyodbc by mapping to nanoseconds. Likely only the first seven digits will differ from zero though, but that is a tale for another day ...

Best, Markus

pacman82 commented 1 month ago

Here is a link to a test demonstrating one possible use of map_schema for a different workaround:

https://github.com/pacman82/arrow-odbc-py/blob/372ba4bbfb3a619d4f0e199484e5387d9ed3e06c/tests/test_arrow_odbc.py#L651

jayctran commented 1 month ago

Hi Markus,

Thanks for the detailed response! That is brilliant and it's exactly what I was looking for!

I'm using polars, so I will test whether the options pass through and work with map_schema.

Thanks again! James