0xcaff / duckdb_protobuf

a duckdb extension for querying encoded protobuf messages

allow for passing list of globs to files #21

Open 0xcaff opened 1 week ago

0xcaff commented 1 week ago

I'd like to be able to pass a list of globs to the files argument, not just a single glob.

SELECT *
FROM protobuf(
    descriptors = './descriptor.pb',
    files = ['./scrape/data/SceneVersion/**/*.bin', './other/location/data/SceneVersion/**/*.bin'],
    message_type = 'test_server.v1.GetUserSceneVersionResponse',
    delimiter = 'BigEndianFixed'
)
LIMIT 10;

To support this, I would need to change the declared type of the files parameter here:

https://github.com/0xcaff/duckdb_protobuf/blob/8e22da4da3575c68c9340be20a2d1f66b63531ad/packages/duckdb_protobuf/src/vtab.rs#L109-L111
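As a rough sketch of the shape I have in mind (not the extension's actual code): once the parameter can carry either a single string or a list, both forms could be flattened into one list of glob patterns before scanning. The names FilesArg and into_patterns are hypothetical, and the sketch deliberately leaves out the DuckDB bindings, since how the value is read from the bind call depends on what the C API exposes.

```rust
/// Hypothetical representation of the `files` argument:
/// either one glob pattern or a list of glob patterns.
#[derive(Debug, Clone, PartialEq)]
enum FilesArg {
    Single(String),
    List(Vec<String>),
}

impl FilesArg {
    /// Flatten both forms into a list of glob patterns so the rest of the
    /// scan code only has to handle one shape. Order is preserved.
    fn into_patterns(self) -> Vec<String> {
        match self {
            FilesArg::Single(pattern) => vec![pattern],
            FilesArg::List(patterns) => patterns,
        }
    }
}
```

With this, files = './a/**/*.bin' and files = ['./a/**/*.bin', './b/**/*.bin'] would both end up as a Vec<String> of patterns, so the existing single-glob path keeps working unchanged.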

It seems like the CSV reader does the same thing here:

https://github.com/duckdb/duckdb/blob/3550708a8456e193e5d8ccc21a3dd6563e8a0c21/src/execution/operator/csv_scanner/util/csv_reader_options.cpp#L302-L304

using this value:

https://github.com/duckdb/duckdb/blob/3ed5d833dd95556c0a4f88175f4bf9722f978b9e/src/function/table/read_csv.cpp#L254

Unfortunately, it seems like the C API only exposes a subset of the logical types:

https://github.com/duckdb/duckdb/blob/05176cd88e42c33a2a039bdfa5da798e0f5266ff/src/main/capi/logical_types-c.cpp#L25-L27

That subset does not include the ANY type:

https://github.com/duckdb/duckdb/blob/c1e3416ad904855c925b0ecc8a16b2f958e9bf77/src/main/capi/helper-c.cpp#L5-L61

https://github.com/duckdb/duckdb/blob/4aa0fb8a4968c0a5da7addda7cd2784f7420485f/src/include/duckdb/common/types.hpp#L189

I asked in Discord whether we can get this added:

https://discord.com/channels/909674491309850675/1148659944669851849/1282054644096434177

0xcaff commented 6 days ago

Looks like libduckdb-sys needs to be upgraded for this to happen. The new ANY type is in the v1.1.0 release.