EthanRosenthal / medium-data-bakeoff

A python library bakeoff for medium sized datasets
MIT License

Updated benchmarks #19

Closed by EthanRosenthal 1 year ago

EthanRosenthal commented 1 year ago

@sullivancolin I just ran the benchmarks from #18 on my computer. Interestingly, I see duckdb performing significantly better than polars on my machine. I ran this a couple of times, and the result was consistent. I did update polars to 0.8.15 since 0.8.14 was yanked. Maybe that accounts for the perf difference? I also had to increase Docker shared memory on my machine to 10 GB. Modin continues to show poor performance.

sullivancolin commented 1 year ago

Weird. I just ran the main branch in Docker and got similar results to what I showed in the PR thread. I'm also using an Intel Core i7. I have 12 cores with 32 GB of RAM.

benchmark_50