adrian17 opened this issue 2 days ago
I have a MacBook Pro with 96 GB of RAM and an Intel processor. The processor is outdated, since Apple has moved on to the M2/M3 and so on, but otherwise it is a decently sized notebook. It ran the 10-billion-row benchmark to completion in less than 2 hours.
Maybe the reason Polars didn't run is that I ran it through Python; maybe through Rust it would have run. I ran it through Python because it is more convenient and I wasn't concerned about the overhead. According to the literature, the overhead is negligible as long as you don't process the data with additional Python code, for example Python loops.
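To make that concrete, here is a minimal sketch of the overhead claim (sizes and timings are illustrative, not from the benchmark; `pl` is Polars' Python package):

```python
import time
import polars as pl

# 2 million rows is enough to show the gap without a long runtime.
df = pl.DataFrame({"a": list(range(2_000_000))})

# The sum runs inside Polars' Rust core; Python only dispatches one call.
t0 = time.perf_counter()
native = df["a"].sum()
print(f"native sum:  {time.perf_counter() - t0:.4f}s")

# Looping in Python materializes every value as a Python object,
# which is where the real overhead lives.
t0 = time.perf_counter()
looped = sum(x for x in df["a"])
print(f"python loop: {time.perf_counter() - t0:.4f}s")

assert native == looped
```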
The question stands: 96 GB of RAM doesn't sound like it would have handled 240 GB of data, unless everything spilled to swap, which also shouldn't happen, since AFAIK swap isn't unbounded on Macs.
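For reference, the back-of-envelope arithmetic behind that 240 GB figure, as a quick sanity check:

```python
rows = 10_000_000_000   # 10 billion rows
cols = 3                # three columns
width = 8               # one 8-byte double per value

total_bytes = rows * cols * width
print(total_bytes / 1e9)   # 240.0 GB, two and a half times the 96 GB of RAM
```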
I answered your question. There is nothing more I can say. I think there is only one way for you to convince yourself.
The statement currently doesn't pass the smell test: three columns with 10 billion rows, each value an 8-byte double, add up to ~240 GB. And looking at the implementation, the values are simply stored in a `vector`, so a single buffer without any tricks. Can you please either confirm that this number is correct (and explain how this worked while the other libraries didn't manage to), or fix it?

In a future benchmark, how about, once you find a set size where all three libraries work, you also measure and write down the peak memory used by the process with standard OS tools? (On my Linux box I used `/usr/bin/time -v`, but there are probably more ways to do that.)
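For completeness, a sketch of reading peak RSS from inside the process itself, using only the standard library (assuming a POSIX system; on macOS the BSD `/usr/bin/time -l` prints the same figure, and note that `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS):

```python
import resource
import sys

def peak_rss_bytes() -> int:
    """Return this process's peak resident set size in bytes."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is kilobytes on Linux but bytes on macOS.
    return rss if sys.platform == "darwin" else rss * 1024

# ... run the benchmark workload here ...
print(f"peak RSS: {peak_rss_bytes() / 1e9:.1f} GB")
```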