Open openmodlogs opened 9 months ago
I should note I am using polars-lts-cpu, and I get the same results with versions 0.20.5 and 0.20.4.
I can't reproduce
In [2]: %timeit pl.scan_ndjson('big_test_data.json', n_rows=10).collect()
160 µs ± 829 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [3]: %timeit pl.scan_ndjson('big_test_data.json').collect()
410 ms ± 14.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
polars: 0.20.5 cpu: M1
Checks
Reproducible example
Log output
No response
Issue description
scan_ndjson is scanning the entire file no matter which arguments are passed. This also happens when using .head(10) before .collect().
Expected behavior
I expect scan_ndjson to "Stop reading from JSON file after reading n_rows" when providing a value for n_rows. I'd expect the same results when using .head(n) before .collect() or .fetch(n).
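To make the expectation concrete, here is a minimal plain-Python sketch of what "stop reading after n_rows" would mean for a newline-delimited JSON file. It is not how Polars implements scan_ndjson; the helper names read_ndjson_head and write_sample, the sample file name, and the record layout are all hypothetical and exist only for illustration. The sketch assumes one JSON object per line with no blank lines.

```python
import itertools
import json
import os
import tempfile


def read_ndjson_head(path, n_rows):
    """Parse at most n_rows records from an NDJSON file.

    Hypothetical helper for illustration, not part of Polars.
    Returns (records, lines_read) so the caller can see how many
    lines were actually pulled from the file.
    """
    records = []
    lines_read = 0
    with open(path, encoding="utf-8") as fh:
        # islice stops requesting lines once n_rows have been taken,
        # so the remaining lines are never parsed.
        for line in itertools.islice(fh, n_rows):
            lines_read += 1
            records.append(json.loads(line))
    return records, lines_read


def write_sample(path, total_rows):
    """Write total_rows small JSON objects, one per line (made-up data)."""
    with open(path, "w", encoding="utf-8") as fh:
        for i in range(total_rows):
            fh.write(json.dumps({"id": i, "value": i * 2}) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.ndjson")
        write_sample(path, 1000)
        records, lines_read = read_ndjson_head(path, 10)
        # Only 10 of the 1000 lines are parsed.
        print(len(records), lines_read)
```

With early termination like this, the cost of reading the first 10 rows should not depend on the total file size, which is the behavior I would expect from n_rows, .head(n) and .fetch(n) alike.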
Installed versions