coiled / benchmarks


Fix pyspark local execution #1492

Closed · milesgranger closed this 4 months ago

milesgranger commented 6 months ago

I've tried this with scale 5, 10, and 100 data without trouble, after verifying that the current state is broken.

My machine has 62 GB, so the logic gives ~23 GB to the executors; I've also tried capping that at 5 GB per executor and it still worked.

I somewhat arbitrarily assigned n_executors = 2; it could very well be a single executor or something else. I'm not sure what others would prefer here.
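
For readers following along, here is a minimal sketch of the kind of executor-memory arithmetic under discussion, assuming the sizing is derived from available system memory. The names `n_executors` and `total_executor_memory_g` come from this thread; the headroom constant and the Spark config key are illustrative assumptions, not the PR's actual code.

```python
import psutil  # assumption: free memory is queried via psutil
from pyspark.sql import SparkSession

n_executors = 2  # somewhat arbitrary, per the comment above

# Assumption: leave a few GB of headroom for the driver and OS,
# then split the remainder evenly across executors.
available_g = psutil.virtual_memory().available // 2**30
total_executor_memory_g = available_g - 4
executor_memory_g = total_executor_memory_g // n_executors

spark = (
    SparkSession.builder
    .master(f"local[{n_executors}]")
    .config("spark.executor.memory", f"{executor_memory_g}g")
    .getOrCreate()
)
```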

fjetter commented 6 months ago

I'm running on an M1 MacBook, and when I close most of my browser tabs and my IDE I have about 9.4 GiB of available memory. When I initially executed this script I had essentially no available memory, and the total_executor_memory_g math produced negative values for me.
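
To illustrate the failure mode: with only a few GiB free, that subtraction goes negative. The clamp below is only a suggested guard under the same assumptions as the sketch above, not necessarily what the PR does.

```python
import psutil

available_g = psutil.virtual_memory().available // 2**30
# On a machine with little free memory, available_g - 4 is zero or negative,
# which would translate into an invalid setting such as "-1g".
total_executor_memory_g = max(1, available_g - 4)  # clamp to at least 1 GB
```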

fjetter commented 6 months ago

Regardless of the settings, this doesn't work for me. I tried a couple of different configs, and even on scale 1 I have a job that appears to be stuck:

[screenshot of the stuck job]

I get that Spark may be slow, but I don't believe it can't handle 1 GB locally.

fjetter commented 6 months ago

Also, I need to add `.config("spark.driver.bindAddress", "127.0.0.1")` (see https://github.com/coiled/benchmarks/pull/1490) to make this work for me. I suspect this is an OSX thing.
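
For reference, that workaround looks roughly like this when building the session; the app name and master string below are placeholders:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tpch-local")   # placeholder app name
    .master("local[2]")      # placeholder master string
    # Bind the driver to localhost; on some macOS setups the driver otherwise
    # binds to an address the executors cannot reach.
    .config("spark.driver.bindAddress", "127.0.0.1")
    .getOrCreate()
)
```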

milesgranger commented 5 months ago

Yeah, I'm slightly baffled. I was concentrating on scale 100 to get it running for the comparison, but now I'm experiencing a similar scenario on scales 1, 5, and 10: queries fail, while on scale 100 they all work fine. :face_exhaling:

2024-04-03 10:12:43,053 - distributed.nanny.memory - WARNING - Worker tcp://127.0.0.1:35463 (pid=148634) exceeded 95% memory budget. Restarting...

This occurs on the lower scales for me, but again, not on scale 100.
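
For context on that warning: it is emitted by Dask's nanny process when a worker crosses the `distributed.worker.memory.terminate` fraction (0.95 by default). A small sketch of how one might inspect or relax that threshold while debugging; this is not part of the PR:

```python
import dask
import distributed  # noqa: F401  (importing registers the distributed config defaults)

# The "95% memory budget" in the warning corresponds to this setting.
print(dask.config.get("distributed.worker.memory.terminate"))  # 0.95 by default

# Temporarily relax/disable the terminate threshold while debugging,
# e.g. before starting a LocalCluster. Disabling it risks swapping or OOM.
with dask.config.set({"distributed.worker.memory.terminate": False}):
    ...  # start the cluster / run the queries here
```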

hendrikmakait commented 4 months ago

Superseded by #1505