amanchokshi / EMBERS

Experimental Measurement of BEam Responses with Satellites
https://embers.readthedocs.io
MIT License

care with the --max_cores= flags #13

Closed teuben closed 3 years ago

teuben commented 3 years ago

I had previously noted that before --max_cores= was implemented, my laptop was run into the ground: the load was 20+, basically because I saw 8 threads of (in this case) ephem_batch running. After the flag was implemented, I ran it with --max_cores=2 and saw that two were running, but each at about 300% CPU. Clearly, if I want to be able to do any other work, --max_cores=1 is my sweet spot. Whatever code runs in a thread must itself contain parallel code (in numpy?) that is CPU hungry. This also explains the load of 20+. Now, running it with 1 core, I see the CPU load going over 400% from time to time. I have 4 cores (2 threads per core), so anything over 400% will usually be bad for the threads. So for me 1 seems to be the best.
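The arithmetic behind these numbers can be written out as a rough sketch. Nothing here comes from EMBERS: `busy_threads` and `oversubscribed` are hypothetical helper names, and the sketch assumes each worker keeps a fixed number of threads busy, which real workloads only approximate.

```python
def busy_threads(workers, threads_per_worker):
    """Approximate number of simultaneously busy threads (hypothetical helper)."""
    return workers * threads_per_worker


def oversubscribed(workers, threads_per_worker, physical_cores):
    """True if there are more busy threads than physical cores."""
    return busy_threads(workers, threads_per_worker) > physical_cores


# 8 workers at roughly 300% CPU each: about 24 busy threads,
# in line with the reported load of 20+.
print(busy_threads(8, 3))        # 24

# --max_cores=2 at roughly 300% each on a 4-core laptop.
print(oversubscribed(2, 3, 4))   # True

# --max_cores=1 using about 4 threads on the same laptop.
print(oversubscribed(1, 4, 4))   # False
```

On this rough model, only the single-worker case stays within the 4 physical cores, which matches the observation that --max_cores=1 is the comfortable setting on this laptop.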

amanchokshi commented 3 years ago

A colleague recently pointed out to me that numpy and scipy are intrinsically very efficient and often use a lot of the available resources to speed things up. Parallelizing on top of that does seem to mean that batch scripts such as align_batch and ephem_batch can easily use up all available resources. This hadn't been a problem for me, as I used a server in the department when running my full dataset.

I'll add a note to the documentation suggesting using --max_cores=1 if using a laptop. Is that okay?
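Besides lowering --max_cores, another common way to tame this kind of nested parallelism is to cap the threads used by the numerical backend inside each process. The sketch below is an assumption about a typical numpy setup, not something EMBERS is known to do: `limit_backend_threads` is a hypothetical helper, and which of the environment variables actually has an effect depends on how numpy was built (OpenMP, OpenBLAS or MKL).

```python
import os

# Environment variables read by common numerical backends.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_backend_threads(n_threads=1):
    """Ask OpenMP, OpenBLAS and MKL to use at most n_threads per process."""
    for var in THREAD_VARS:
        os.environ[var] = str(n_threads)


# Call this before numpy or scipy is imported: the backends typically
# read these variables once, when they are first loaded.
limit_backend_threads(1)
```

Worker processes started afterwards inherit the environment, so each one would then stay close to 100% CPU, and the total load would scale with the number of workers rather than with workers times backend threads.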

teuben commented 3 years ago

Yes, this would be good. Not many people realize that a lot of laptops with temperature-sensitive CPUs (the Intel U series, for example) will lower their CPU frequency if they get hot. Running all cores can trigger this, and most codes will actually thrash the CPU if the threads in a core are used "as cores". So for CASA my sweet spot is always to use just the cores, and not the threads.
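The "cores, not threads" rule of thumb can be sketched as follows. `suggested_workers` is a hypothetical helper, not an EMBERS or CASA function. It assumes 2 hardware threads per core, as on the laptop described above, and that each worker keeps only one thread busy. Note that `os.cpu_count()` reports logical CPUs, so it has to be divided down. The third-party psutil package can report physical cores directly via `psutil.cpu_count(logical=False)`.

```python
import os


def suggested_workers(logical_cpus=None, threads_per_core=2, reserve=1):
    """Suggest a worker count: physical cores, minus some kept free for other work."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    physical = max(1, logical_cpus // threads_per_core)
    return max(1, physical - reserve)


# 8 logical CPUs at 2 threads per core: 4 physical cores, one kept free.
print(suggested_workers(logical_cpus=8))   # 3
```

If each worker itself runs several numpy threads, the suggested count would need to be divided further, which is how a setting of 1 can end up being the right answer.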

amanchokshi commented 3 years ago

That's really interesting! I'm learning quite a bit through all this analysis into the performance of EMBERS, thanks :)

I have updated the documentation.