Open ericemc3 opened 6 months ago
Thanks a lot for the bug report.
This does not reproduce in the DuckDB CLI, where in both cases 7.5 MB go through the network, as per EXPLAIN ANALYZE.
This is a problem specific to the duckdb-wasm implementation of GET requests and needs to be solved there. It's pretty bad, since the multiplier can be even worse.
@szarnyasg: can you move it to the duckdb-wasm repository?
Thanks @carlopi for chiming in. I moved the issue.
Related "discussions" about fetched data amount:
What happens?
My remote Parquet file weighs 7.2 MB. If I read it with a simple WHERE, more than 15 MB pass through the network.
To Reproduce
CREATE OR REPLACE TABLE t AS FROM 'https://static.data.gouv.fr/resources/tables-aufilduboamp-2024/20240113-061700/boamp-panorama-2024-parquet-integral.parquet' ;
=> 7.2 MB transferred (Chrome DevTools network inspector); the same read with a simple WHERE => 15.6 MB
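To put the two figures above in perspective, here is a minimal sketch of the read amplification they imply, i.e. the ratio of bytes fetched to the size of the file. The variable and function names are illustrative only, and the numbers are simply the ones reported in this issue, not new measurements.

```python
# Figures reported in this issue (Chrome DevTools network inspector), in MB.
file_size_mb = 7.2      # size of the remote Parquet file
transferred_mb = 15.6   # amount fetched when reading with a simple WHERE


def read_amplification(transferred: float, file_size: float) -> float:
    """Return how many times the file size was pulled over the network."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    return transferred / file_size


ratio = read_amplification(transferred_mb, file_size_mb)
print(f"read amplification: {ratio:.2f}x")
```

A ratio above 1 means the filtered read cost more network traffic than downloading the whole file once.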
OS:
Win11
DuckDB Version:
9.2
DuckDB Client:
shell wasm or cli
Full Name:
eric mauviere
Affiliation:
icem7
Have you tried this on the latest main branch?
I have tested with a main build
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?