duckdb / duckdb_spatial

MIT License

Using the "Spatial" extension to read Excel files gives an error with versions 0.9.0 and above. It throws "NotImplementedException: Not implemented Error: GDAL Error (6): Adding too many columns to too many existing features". The same code worked with DuckDB 0.8.1 #230

Open debottamroychowdhury opened 10 months ago

debottamroychowdhury commented 10 months ago

What happens?

Running the statement under "To Reproduce" gives me the following error with recent versions of DuckDB:

NotImplementedException: Not implemented Error: GDAL Error (6): Adding too many columns to too many existing features

To Reproduce

CREATE OR REPLACE TABLE 'DummyTable' AS SELECT * FROM ST_Read('file_name', layer='Sheet1', open_options=['HEADER=FORCE'])
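
For anyone trying to reproduce this from the Python client, the following is a minimal sketch rather than the reporter's actual script. It builds the same statement and runs it on an in-memory connection. The helper names `build_repro_sql` and `run_repro` and the file name `book.xlsx` are made up for illustration, the table name is written as a double-quoted identifier instead of the single-quoted form above, and `duckdb` is a third-party package that has to be installed separately.

```python
def build_repro_sql(path: str, layer: str = "Sheet1", table: str = "DummyTable") -> str:
    """Build the CREATE TABLE ... ST_Read statement from the report (hypothetical helper)."""

    def quote(value: str) -> str:
        # Double any single quote so the value is safe inside a SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    # Double any double quote so the table name is a valid quoted identifier.
    ident = '"' + table.replace('"', '""') + '"'
    return (
        f"CREATE OR REPLACE TABLE {ident} AS "
        f"SELECT * FROM ST_Read({quote(path)}, layer={quote(layer)}, "
        "open_options=['HEADER=FORCE'])"
    )


def run_repro(path: str) -> None:
    """Run the statement on a fresh in-memory database (needs the duckdb package)."""
    import duckdb  # imported lazily so build_repro_sql works without it

    con = duckdb.connect()
    con.install_extension("spatial")
    con.load_extension("spatial")
    print("DuckDB version:", duckdb.__version__)
    con.execute(build_repro_sql(path))
    print(con.execute('SELECT count(*) FROM "DummyTable"').fetchone())


# Example (not executed here): run_repro("book.xlsx")
```

Printing the version alongside the result makes it easier to compare behaviour between 0.8.1 and 0.9.x with the same file.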

OS:

Windows

DuckDB Version:

0.9.0

DuckDB Client:

Python 3.10

Full Name:

Debottam

Affiliation:

Personal

Have you tried this on the latest main branch?

I have tested with a main build

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

Maxxen commented 10 months ago

Hi! Thanks for filing this issue. Does the same problem occur if you use the latest nightly version of spatial? You can get it for DuckDB v0.9.2 by running:

FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org';

debottamroychowdhury commented 10 months ago

Yes, I ran FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; in my code, but the same issue persists.

Maxxen commented 10 months ago

Could you share the Excel file? You can email me at max@duckdblabs.com if you don't want to share it publicly.

debottamroychowdhury commented 10 months ago

I think you have just pointed me to the actual cause. The same Excel file that worked with this code in DuckDB 0.8.1 does not work in DuckDB 0.9.x. When I copied its columns into a new Excel file, the new file loaded fine in 0.9.x as well.

The cause is clear now: in my Excel file, columns A to D have data in every row, and columns G and H have data in only a few rows near the end of the sheet. A file laid out like this throws the error with DuckDB 0.9.x versions. Please note that the same file worked with 0.8.1.

debottamroychowdhury commented 10 months ago

Also note that if the data in columns G and H is moved toward the start of the sheet (for example, starting at row 6), the file loads fine. The issue only occurs when that data sits near the end of the sheet.
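
To make the layout described above concrete, here is a rough sketch that flags columns whose first non-empty cell appears only near the end of a sheet. It is an illustration of the reported pattern, not GDAL's or the spatial extension's actual logic. The function names, the in-memory row representation, and the `tail_fraction` cutoff of 0.8 are all assumptions chosen for the example; the thread does not say where the real threshold lies.

```python
def late_starting_columns(rows, tail_fraction=0.8):
    """Return 0-based indexes of columns whose first non-empty cell lies in the sheet's tail.

    rows is a list of lists, one inner list per sheet row. tail_fraction is an
    arbitrary cutoff for this sketch, not a value taken from GDAL.
    """
    n_rows = len(rows)
    width = max((len(row) for row in rows), default=0)
    late = []
    for col in range(width):
        # Index of the first row that has a non-empty value in this column, if any.
        first = next(
            (i for i, row in enumerate(rows)
             if col < len(row) and row[col] not in (None, "")),
            None,
        )
        if first is not None and first >= n_rows * tail_fraction:
            late.append(col)
    return late


def make_sheet(n_rows, extra_from):
    """Build rows with columns A-D always filled and G-H filled from row index extra_from."""
    rows = []
    for i in range(n_rows):
        row = [1, 2, 3, 4, None, None, None, None]  # columns A..H
        if i >= extra_from:
            row[6], row[7] = "g", "h"  # columns G and H
        rows.append(row)
    return rows


# G and H only populated in the last two of ten rows: both columns are flagged.
print(late_starting_columns(make_sheet(10, extra_from=8)))
# G and H populated from row 6 (index 5) onwards: nothing is flagged.
print(late_starting_columns(make_sheet(10, extra_from=5)))
```

A check like this could be run on a sheet's rows before handing the file to ST_Read, to see whether it matches the layout that triggers the error.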