Closed by hendrikmakait 7 months ago
FWIW, the OOM error itself could possibly be solved by rebuilding DuckDB without jemalloc (https://github.com/duckdb/duckdb/issues/8135).
I don't think that we want to rebuild duckdb ourselves?
We might want to send them a reproducer, though, if it reproduces consistently for us.
> I don't think that we want to rebuild duckdb ourselves?
I'm not saying we should; this was more of a way for me to log a possibly related issue (and workaround). I agree that we may want to send them a reproducer if this persists.
Another possibly related issue: https://github.com/duckdb/duckdb/issues/3391
Looking at previous runs, this is not perfectly reproducible, but we can usually reproduce it at some point during a scale 1000 run.
Clusters:
- https://cloud.coiled.io/clusters/380463/information?viewedAccount=%22dask-benchmarks%22&tab=Logs&filterPattern=
- https://cloud.coiled.io/clusters/380432/information?viewedAccount=%22dask-benchmarks%22&tab=Logs&filterPattern=
- https://cloud.coiled.io/clusters/379551/information?viewedAccount=%22dask-benchmarks%22&tab=Logs&filterPattern=
- https://cloud.coiled.io/clusters/377502/information?viewedAccount=%22dask-benchmarks%22&tab=Logs&filterPattern=
Fixed by #1400
While running TPC-H benchmarks at scale 1000 with DuckDB, I've noticed that failures cascade and cause subsequent tests to fail as well.
Cluster: https://cloud.coiled.io/clusters/383513/information?viewedAccount=%22dask-benchmarks%22&tab=Logs&filterPattern=
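For illustration only, here is a minimal sketch of one way to keep a single failure from cascading. It assumes the cascade comes from state shared inside one long-lived process, which this thread does not confirm, and it is not a description of what #1400 changed. Each query runs in a fresh interpreter, so a crash or OOM kill in one cannot affect the next. The `run_isolated` helper and the `queries` snippets are hypothetical stand-ins, not part of the benchmark suite.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> bool:
    """Run a Python snippet in a fresh interpreter.

    Returns True if the child exited with status 0, False if it
    crashed, was killed, or exceeded the timeout.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


# Hypothetical stand-ins for benchmark queries (assumption, not real TPC-H code).
queries = {
    "q1": "print('ok')",
    "q2": "import os; os._exit(137)",  # simulates a process dying abruptly
    "q3": "print('ok')",
}

# Each query gets its own process, so q2 failing does not stop q3.
results = {name: run_isolated(code) for name, code in queries.items()}
print(results)
```

The trade-off is startup overhead per query and losing any warm caches between runs, which may or may not be acceptable for a benchmark.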