Closed volcan01010 closed 1 year ago
Actually, I've changed my mind. read_lob is a specific transform and should be carried out by the transform function. It should be deprecated and replaced with an example transform.
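As a rough idea of what such an example transform could look like, here is a minimal sketch. It assumes the transform receives a chunk (a list of rows) and that rows are tuples or namedtuples; the name read_lobs and the FakeLob stand-in are hypothetical and only there for illustration. Anything with a read() method is treated as a LOB.

```python
from collections import namedtuple


def read_lobs(chunk):
    """Yield each row of the chunk with LOB-like values replaced by their contents.

    Assumption: a "LOB-like" value is anything exposing a read() method.
    """
    for row in chunk:
        values = [v.read() if hasattr(v, "read") else v for v in row]
        if hasattr(row, "_make"):
            # namedtuple row: rebuild the same row type
            yield row._make(values)
        else:
            yield tuple(values)


class FakeLob:
    """Hypothetical stand-in for a driver LOB object, for demonstration only."""

    def __init__(self, data):
        self._data = data

    def read(self):
        return self._data


Row = namedtuple("Row", "id geom")
rows = list(read_lobs([Row(1, FakeLob("POINT (0 0)"))]))
```

Dictionary rows are not handled here; a real example would need to cover whichever row type is in use.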
Also, it is possible to add an extra connection setting to read the LOBs directly:
https://cx-oracle.readthedocs.io/en/latest/user_guide/lob_data.html
This makes things more than 10x faster for me.
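For reference, the setting described on that page is an output type handler attached to the connection. The sketch below follows that pattern; the factory function make_lob_handler is my own hypothetical wrapper (it takes the driver module as an argument so the same code can be tried with either cx_Oracle or oracledb), not something the driver provides.

```python
def make_lob_handler(driver):
    """Build an output type handler that fetches CLOBs as str and BLOBs as bytes.

    `driver` is the imported driver module (e.g. cx_Oracle); it is assumed to
    expose DB_TYPE_CLOB, DB_TYPE_BLOB, DB_TYPE_LONG and DB_TYPE_LONG_RAW.
    """

    def output_type_handler(cursor, name, default_type, size, precision, scale):
        if default_type == driver.DB_TYPE_CLOB:
            return cursor.var(driver.DB_TYPE_LONG, arraysize=cursor.arraysize)
        if default_type == driver.DB_TYPE_BLOB:
            return cursor.var(driver.DB_TYPE_LONG_RAW, arraysize=cursor.arraysize)
        # Returning None leaves the driver's default handling in place
        return None

    return output_type_handler


# Usage needs a real driver and database, so it is left commented out:
# import cx_Oracle
# conn = cx_Oracle.connect(...)
# conn.outputtypehandler = make_lob_handler(cx_Oracle)
```

Setting conn.outputtypehandler back to None should restore the default behaviour of returning LOB objects.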
You can also get Oracle to do the conversion via the SQL request, but the resulting VARCHAR is limited to 4,000 characters so it won't work for large geometry objects with many vertices.
SELECT DBMS_LOB.SUBSTR( lob_column, 4000, 1 ) FROM table
The simplest solution for this is to make all Oracle connections supplied by ETL Helper turn on the reading of CLOBs and BLOBs as strings and bytes by default. ETL Helper is aimed at spatial databases, where this is required. It brings the behaviour of Oracle in line with PostgreSQL text fields, too.
This will be a breaking change. If users need to access LOBs as streaming objects then they will have to provide their own connection or drop the output_type_handler once the connection has been made.
The new Oracle driver has a flag to read LOBs as Python objects: https://cjones-oracle.medium.com/open-source-python-thin-driver-for-oracle-database-e82aac7ecf5a
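As far as I can tell, the flag in question is oracledb.defaults.fetch_lobs. A minimal sketch of toggling it is below; the helper name fetch_lobs_as_values is hypothetical, and the driver module is passed in as an argument so the snippet does not need oracledb installed to be read or tried against a stand-in.

```python
def fetch_lobs_as_values(driver, enabled=True):
    """Ask the driver to return CLOBs as str and BLOBs as bytes.

    `driver` is assumed to be the imported oracledb module. Its
    defaults.fetch_lobs setting is True when LOB locator objects should be
    returned, so it is set to the opposite of `enabled`.
    """
    driver.defaults.fetch_lobs = not enabled
    return driver.defaults.fetch_lobs


# Usage needs the real driver, so it is left commented out:
# import oracledb
# fetch_lobs_as_values(oracledb)        # LOB columns come back as str / bytes
# fetch_lobs_as_values(oracledb, False) # back to LOB objects
```

Note that this is a module-wide default rather than a per-connection setting, so it would affect every connection made afterwards.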
Note: Now that #151 has been merged, we should test how the new oracledb driver behaves with CLOBs (these are often used when returning geometries as WKT). If it is significantly faster to return them as strings (I think that it will be), then we should set that as the default and document how to change it.
cc: @sophie-taylor @leorudczenko
Closing as this has been merged into for_v1.
Summary
As an ETL user, I want read_lob to be available to all fetching methods so that I can read Oracle LOBs in results.
Description
Oracle LOBs are used to store geospatial data. They have to be read explicitly before they can be used. The read_lob keyword is currently only available for copy_rows but should be usable more widely.
Acceptance criteria
fetchone(..., read_lob=True) works
fetchmany(..., read_lob=True) works
fetchall(..., read_lob=True) works