polydbms / sheetreader-duckdb

MIT License
44 stars 4 forks source link

Binder Error: Inline & dynamic String types not supported yet #57

Open wonb168 opened 1 week ago

wonb168 commented 1 week ago

I read a excel,all sheets are OK except first 1, but I don't find any different

error:

SELECT * FROM sheetreader(
    '/Users/linezone/Downloads/data1024.xlsx',
    sheet_index = 1,
    threads = 16,
    skip_rows = 0,
    has_header = true
);

SQL ERROR: java.sql.SQLException: Binder Error: Inline & dynamic String types not supported yet

why? thanks! [Uploading address.xlsx…]()

freddie-freeloader commented 1 week ago

Hey @wonb168,

I was never able to produce an XLSX with inline strings, so couldn't add proper support to it. :smile: It looks like you tried to upload the XLSX, but it didn't work. Could you maybe try again -- if possible, as a minimal example?

wonb168 commented 1 week ago

I uploaded the Excel,please see the attachment

l1t1 commented 1 week ago

this xlsx file


import pandas as pd

df=pd.DataFrame({'a':range(10)})

df.to_excel('ten.xlsx',index=0)
select * from sheetreader('ten.xlsx',has_header = true);
Binder Error: Inline & dynamic String types not supported yet
l1t1 commented 1 week ago
import polars as pl
df = pl.DataFrame({"foo": [1, 2, 3], "bar": [None, "bak", "baz"]})
df.write_excel("pl.xlsx")
D select * from sheetreader('pl.xlsx',has_header = true);
┌────────────┬──────┐
│    foo     │ bar  │
│    date    │ date │
├────────────┼──────┤
│ 1900-01-01 │      │
│ 1900-01-02 │      │
│ 1900-01-03 │      │
└────────────┴──────┘
select * from sheetreader('pl.xlsx',has_header = true,types=[DOUBLE,VARCHAR]);
Binder Error: User defined type for column with index 0 is not compatible with actual type
l1t1 commented 1 week ago
df_problem = pd.DataFrame(
    data={"id": ["id1", "id2", "id3"], "name": ["nm1", "nm2", "nm3"]},
    index=[1, 2, 3],
)

df_problem.to_excel("pl2.xlsx")

read from number columns are all empty

select * from sheetreader('pl2.xlsx');
┌─────────┬─────────┬──────────┐
│ String0 │ String1 │ Numeric2 │
│ varchar │ varchar │  double  │
├─────────┼─────────┼──────────┤
│         │ id      │          │
│         │ id1     │          │
│         │ id2     │          │
│         │ id3     │          │
└─────────┴─────────┴──────────┘

fixed with skip_rows=1,coerce_to_string

D select * from sheetreader('pl2.xlsx',has_header=TRUE,coerce_to_string=TRUE);
Binder Error: First row must contain only strings when has_header is set to true
D select * from sheetreader('pl2.xlsx',skip_rows=1,coerce_to_string=TRUE);
┌──────────┬─────────┬─────────┐
│ Numeric0 │ String1 │ String2 │
│  double  │ varchar │ varchar │
├──────────┼─────────┼─────────┤
│      1.0 │ id1     │ nm1     │
│      2.0 │ id2     │ nm2     │
│      3.0 │ id3     │ nm3     │
└──────────┴─────────┴─────────┘