GlareDB / glaredb

GlareDB: An analytics DBMS for distributed data
https://glaredb.com
GNU Affero General Public License v3.0

Excel Tables #2949

Open tychoish opened 2 weeks ago

tychoish commented 2 weeks ago

Would it be possible to support Excel tables? I think more and more data is stored in this structure, and tables have many advantages over ranges (they are relatively close to a database table, with column titles and no merged columns). Tables are named and, AFAIK, their names are unique within a workbook (see, for instance, Apache POI: https://poi.apache.org/apidocs/dev/org/apache/poi/ss/usermodel/Table.html, ExcelJS: https://github.com/exceljs/exceljs?tab=readme-ov-file#tables, the ImportExcel PowerShell module: https://www.powershellgallery.com/packages/ImportExcel/6.5.0/Content/GetExcelTable.ps1, or the Open XML SDK for .NET: https://learn.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.table?view=openxml-3.0.1, to give a few examples I came across). I don't know of any Rust implementation, though (does Calamine support it?). It's frustrating to have data in Excel tables, or data loaded into Excel through Power Query (which creates Excel tables), and not be able to load it by name.

Expected behaviour would be:

select * from read_excel('path/to/file.xlsx', table => 'table1') -- Uses the specified table
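
For context, here is a rough sketch of what resolving a table name to a cell range could involve. It is not GlareDB or Calamine code. In the Open XML format, each table is described by its own XML part whose root `<table>` element carries `name`, `displayName`, and `ref` (the cell range) attributes. The sketch assumes table parts sit under `xl/tables/`, which is the usual layout but not guaranteed; a robust reader would follow the worksheet relationships instead. `list_tables` and `make_demo_archive` are hypothetical helper names, and the demo archive contains only a table part, not a full workbook.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace, used by the <table> root element.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_tables(xlsx_bytes: bytes) -> dict[str, str]:
    """Map each table's displayName to its cell range (the `ref` attribute)."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        for part in zf.namelist():
            # Assumption: table parts live under xl/tables/ (typical layout).
            if part.startswith("xl/tables/") and part.endswith(".xml"):
                root = ET.fromstring(zf.read(part))
                if root.tag == f"{{{NS}}}table":
                    tables[root.get("displayName")] = root.get("ref")
    return tables


def make_demo_archive() -> bytes:
    """Build a tiny zip holding only one table part (not a full workbook)."""
    table_xml = (
        f'<table xmlns="{NS}" id="1" name="table1" '
        'displayName="table1" ref="A1:C4"/>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/tables/table1.xml", table_xml)
    return buf.getvalue()


demo_tables = list_tables(make_demo_archive())
print(demo_tables)
```

With a lookup like this, a `table => 'table1'` argument would reduce to reading the range stored in `ref` from the sheet that owns the table.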

Hope it's doable without too much effort. Thanks!

Originally posted by @jgranduel in https://github.com/GlareDB/glaredb/issues/1994#issuecomment-2088797993

tychoish commented 2 weeks ago

@jgranduel, thanks for the feature request! I've moved it to this issue for better tracking.

Adding support for Excel Tables may be possible, but I'd need to look into it more before I can put together an estimate/timeline.