Same issue for me on a Mac with an M2 Max chip.
@seanlinden @thomaswilliamsastro Thanks for reporting this. We have made several changes to the way multiprocessing works in the pipeline since this issue was submitted, including some bug fixes that should resolve this. These were released with pipeline version 1.14.0 on 29 March. We have added a documentation page here (thanks to @penaguerrero) that gives best-practice recommendations for using multiprocessing. That page includes an example very similar to what you are trying to do (example 2). I just verified that the documented example works on my local machine to process multiple uncal files at the same time, running macOS Ventura 13.6.3 on an Apple M2 Max chip with Python 3.11 and jwst==1.14.0.
While we recommend following the new docs, this modified version of your script also works fine on my machine:
```python
import gc
import glob
import multiprocessing as mp

from jwst.pipeline import calwebb_detector1


def run_lv1(uncal_file):
    # Build the pipeline with its reference-file configuration
    config = calwebb_detector1.Detector1Pipeline.get_config_from_reference(
        uncal_file
    )
    detector1 = calwebb_detector1.Detector1Pipeline.from_config_section(config)

    # Run the level 1 pipeline
    detector1.run(uncal_file)

    # Return the file name so the parent process can track completed files
    return uncal_file


if __name__ == "__main__":
    input_files = glob.glob("*uncal.fits")
    procs = 6

    # This one no longer crashes
    with mp.get_context("fork").Pool(procs) as pool:
        successes = []
        for success in pool.imap_unordered(run_lv1, input_files):
            successes.append(success)
        pool.close()
        pool.join()

    gc.collect()
```
Please try using the newly documented example and the latest pipeline version.
Thanks @emolter!
I'm going to close this issue, as the changes above and the suggested new script appear to address it. Please feel free to re-open if they do not solve the problem (or if there are other issues).
I'm having an issue running the jump step with multiprocessing on my MacBook Pro (M1 chip). This doesn't seem to be a problem on a Linux machine, so I suspect it is Mac-only.
The problem seems to be opencv trying to do its own multithreading under the hood, which doesn't work if you're already using multiprocessing. A minimal reproducible example that crashes is sketched below.
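Something along these lines, assuming the crash is triggered by running `Detector1Pipeline` (which contains the jump step) over several uncal files; the file pattern and pool size are illustrative placeholders, not the original values:

```python
import glob
import multiprocessing as mp

from jwst.pipeline import calwebb_detector1


def run_jump(uncal_file):
    # Detector1Pipeline includes the jump step, which calls into opencv
    detector1 = calwebb_detector1.Detector1Pipeline()
    detector1.run(uncal_file)


if __name__ == "__main__":
    input_files = glob.glob("*uncal.fits")

    # Plain serial loop: works fine
    for input_file in input_files:
        run_jump(input_file)

    # Forked worker pool: crashes, because the forked children inherit
    # opencv's thread state from the parent process
    with mp.get_context("fork").Pool(4) as pool:
        pool.map(run_jump, input_files)
```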
The `for` loop works fine; the `mp.Pool()` version crashes. This is fixed by adding `cv.setNumThreads(1)` at the top of `jump.py`. So I suspect the solution here is some kind of test to catch whether we're already inside multiprocessing and, if so, to set the cv thread count to 1, as sketched below.
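A minimal sketch of what such a guard could look like, assuming `jump.py` imports opencv as `cv`; the `parent_process()` check is just one possible way to detect that we are inside a worker process, not a vetted fix:

```python
import multiprocessing

import cv2 as cv

# multiprocessing.parent_process() returns None in the main process and a
# Process handle inside a worker, so this only disables opencv's internal
# threading when the jump step is already running under multiprocessing.
if multiprocessing.parent_process() is not None:
    cv.setNumThreads(1)
```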