JohannesBuchner / PyMultiNest

Pythonic Bayesian inference and visualization for the MultiNest Nested Sampling Algorithm and PyCuba's cubature algorithms.
http://johannesbuchner.github.io/PyMultiNest/
Other
191 stars 87 forks

General query about running pymultinest on Apple M1, and resulting processing speed #222

Closed DavidMoiseNataf closed 1 year ago

DavidMoiseNataf commented 1 year ago

I recently switched from a MacBook with a 3.1 GHz Intel Core i5-7267U to an Apple M1 Max, which runs at 3.2 GHz. A quick Google search suggests there should be a ~2x improvement in single-core performance:

https://www.cpu-monkey.com/en/compare_cpu-apple_m1-vs-intel_core_i5_7267u

I'm noticing only a ~12% reduction in processing time per CPU for the application that I'm using, which is mostly a multinest/pymultinest wrapper. I'm wondering whether this is to be expected, or whether it indicates that I cut a corner somewhere in the installation. Does anyone else have experience in this area?

The amount of RAM used by the same application increased from 750 MB to 2 GB, and the number of threads used increased from 19 to 33.

JohannesBuchner commented 1 year ago

Can you help with https://github.com/JohannesBuchner/PyMultiNest/issues/214 ?

JohannesBuchner commented 1 year ago

You probably need to profile your application to understand the changes. Different library versions may also behave differently (e.g. numpy parallelisation). Try setting OMP_NUM_THREADS=1 to rule out parallelisation in the comparison. As for the memory usage, I don't know; it depends on the application.
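A minimal sketch of both suggestions, in case it helps: it caps the thread pools from inside Python and then profiles a call with the standard-library `cProfile`. The `workload` function is a hypothetical stand-in for the real fit, and the extra variables besides `OMP_NUM_THREADS` are assumptions about which BLAS backend your numpy build uses; unused ones are simply ignored.

```python
import os

# Thread caps are typically read when the numerical libraries are loaded,
# so set them before importing numpy, scipy, or pymultinest.
os.environ["OMP_NUM_THREADS"] = "1"         # OpenMP runtimes
os.environ["OPENBLAS_NUM_THREADS"] = "1"    # numpy built against OpenBLAS
os.environ["MKL_NUM_THREADS"] = "1"         # numpy built against Intel MKL
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"  # numpy built against Apple Accelerate

import cProfile
import io
import pstats


def workload(n=200_000):
    """Hypothetical stand-in for the real fit; replace with your own call."""
    total = 0.0
    for i in range(1, n):
        total += (i % 7) ** 0.5
    return total


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_call(workload)
    print(report)  # the functions with the largest cumulative time come first
```

The same cap can be applied from the shell without touching the code, e.g. `OMP_NUM_THREADS=1 python your_script.py`. Comparing the top entries of the report on both machines should show whether the time goes into the sampler, the likelihood, or something else.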

DavidMoiseNataf commented 1 year ago

The application is Tim Morton's Isochrone code: https://github.com/timothydmorton/isochrones

It might be due to that application (which is mostly a wrapper around pyMultinest to fit for the observed properties of stars) but I think it's more likely to be due to pymultinest or simply multinest.

Can you suggest a very simple script that would isolate the testing of pymultinest? What is OMP_NUM_THREADS=1? Where can I find it?

Thank you.

JohannesBuchner commented 1 year ago

https://www.openmp.org/spec-html/5.0/openmpse50.html

https://stackoverflow.com/questions/30791550/limit-number-of-threads-in-numpy

https://github.com/JohannesBuchner/PyMultiNest/blob/master/pymultinest_demo.py

DavidMoiseNataf commented 1 year ago

Thank you. I ran the pymultinest_demo on both machines. In both cases it takes about 1.5 seconds of CPU time, and it is ~15% faster on the M1 than on the Intel Core i5-7267U. Setting OMP_NUM_THREADS=1 had no impact.

It should be somewhat faster, and I am not sure what the limiting factor is. Or maybe the extra processing power has no impact for this particular application. Please let me know if you have any other suggestions, or if you hear of any similar tests. Thank you.
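A run of about 1.5 seconds can be dominated by start-up and import overhead, so a repeated measurement may give a clearer picture. The sketch below is only an illustration under that assumption: `hypothetical_likelihood_loop` is a made-up placeholder for the code being compared, and the harness records wall-clock time, CPU time, and the architecture the interpreter reports.

```python
import platform
import statistics
import time


def time_call(func, repeats=5):
    """Call func() several times; return lists of wall-clock and CPU seconds."""
    wall, cpu = [], []
    for _ in range(repeats):
        w0 = time.perf_counter()
        c0 = time.process_time()  # CPU time of this process, summed over threads
        func()
        wall.append(time.perf_counter() - w0)
        cpu.append(time.process_time() - c0)
    return wall, cpu


def summarize(wall, cpu):
    """Collect the medians together with the architecture Python reports."""
    return {
        "machine": platform.machine(),  # e.g. 'arm64' or 'x86_64'
        "runs": len(wall),
        "wall_median_s": statistics.median(wall),
        "cpu_median_s": statistics.median(cpu),
    }


def hypothetical_likelihood_loop(n=100_000):
    """Placeholder workload; swap in the function you want to compare."""
    total = 0.0
    for i in range(1, n):
        total += 1.0 / (i * i)
    return total


if __name__ == "__main__":
    wall, cpu = time_call(hypothetical_likelihood_loop)
    print(summarize(wall, cpu))
```

If the CPU time is clearly larger than the wall-clock time, more than one thread is doing work. A `machine` value of `x86_64` on an M1 would suggest the interpreter is an Intel build rather than a native one, which could be one reason for a smaller speed-up than expected.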