pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Globbing support for multiple JSON (not ndjson) files? #12910

Open indigoviolet opened 12 months ago

indigoviolet commented 12 months ago

Description

  1. #6638 added support for scanning multiple JSONL files but, AFAICT, not multiple JSON files. That would also be super useful.

  2. When reading multiple files, it would be helpful in general to (optionally) have an automatic/virtual column of file metadata attached to each row, so we can tell which file each row came from. The most useful metadata would be path and modified time, IMO. (DuckDB and ClickHouse can both add the filename; neither can add the modified time.)

wasimsandhu commented 3 weeks ago

+1