hzovaro / spaxelsleuth

A package for analysing data from large integral field unit surveys such as the SAMI and Hector Galaxy Surveys.
MIT License

settings doesn't consistently update from the user configuration file #8

Closed: hzovaro closed this issue 1 year ago

hzovaro commented 1 year ago

When running the following script on my Mac:

if __name__ == "__main__":
    # Load the user configuration before importing the SAMI routines
    from spaxelsleuth import load_user_config
    load_user_config("/Users/u5708159/Desktop/spaxelsleuth_test/.myconfig.json")
    from spaxelsleuth.loaddata.sami import make_sami_metadata_df, make_sami_df, load_sami_metadata_df, load_sami_df

    nthreads = 4
    DEBUG = True

    # Create the DataFrame
    make_sami_df(bin_type="default",
                 ncomponents="recom",
                 eline_SNR_min=5,
                 correct_extinction=True,
                 metallicity_diagnostics=["R23_KK04"],
                 nthreads=nthreads,
                 debug=DEBUG)

I get the following error:

utils.py (160) _init_num_threads(): INFO: NumExpr defaulting to 8 threads.
utils.py (160) _init_num_threads(): INFO: NumExpr defaulting to 8 threads.
utils.py (160) _init_num_threads(): INFO: NumExpr defaulting to 8 threads.
utils.py (160) _init_num_threads(): INFO: NumExpr defaulting to 8 threads.
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/Users/u5708159/Software/python/modules/spaxelsleuth/spaxelsleuth/loaddata/sami.py", line 580, in _process_gals
    with fits.open(data_cube_path / f"ifs/{gal}/{gal}_A_cube_blue.fits.gz") as hdulist_B_cube:
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/site-packages/astropy/io/fits/hdu/hdulist.py", line 213, in fitsopen
    return HDUList.fromfile(
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/site-packages/astropy/io/fits/hdu/hdulist.py", line 476, in fromfile
    return cls._readfrom(
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/site-packages/astropy/io/fits/hdu/hdulist.py", line 1146, in _readfrom
    fileobj = _File(
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/site-packages/astropy/io/fits/file.py", line 217, in __init__
    self._open_filename(fileobj, mode, overwrite)
  File "/Users/u5708159/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/site-packages/astropy/io/fits/file.py", line 626, in _open_filename
    self._file = open(self.name, IO_FITS_MODES[mode])
FileNotFoundError: [Errno 2] No such file or directory: 'sami/input/ifs/7139/7139_A_cube_blue.fits.gz'
"""

The above exception was the direct cause of the following exception:

FileNotFoundError                         Traceback (most recent call last)
File ~/Software/python/modules/spaxelsleuth/scripts/basic_sami_script.py:12
      9 # Create the DataFrames
     10 make_sami_metadata_df(recompute_continuum_SNRs=True, nthreads=nthreads)
---> 12 make_sami_df(bin_type="default", 
     13             ncomponents="recom", 
     14             eline_SNR_min=5, 
     15             correct_extinction=True,
     16             metallicity_diagnostics=["R23_KK04"],
     17             nthreads=nthreads,
     18             debug=DEBUG)

File ~/Software/python/modules/spaxelsleuth/spaxelsleuth/loaddata/sami.py:1324, in make_sami_df(bin_type, ncomponents, eline_SNR_min, correct_extinction, sigma_gas_SNR_min, eline_ANR_min, eline_list, line_flux_SNR_cut, missing_fluxes_cut, line_amplitude_SNR_cut, flux_fraction_cut, sigma_gas_SNR_cut, vgrad_cut, stekin_cut, metallicity_diagnostics, nthreads, debug, __use_lzifu_fits, __lzifu_ncomponents)
   1321 logger.info(f"beginning pool...")
   1322 pool = multiprocessing.Pool(
   1323     min([nthreads, len(gal_ids_dq_cut)]))
-> 1324 res_list = np.array((pool.map(_process_gals, args_list)), dtype=object)
   1325 pool.close()
   1326 pool.join()

File ~/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/multiprocessing/pool.py:367, in Pool.map(self, func, iterable, chunksize)
    362 def map(self, func, iterable, chunksize=None):
    363     '''
    364     Apply `func` to each element in `iterable`, collecting the results
    365     in a list that is returned.
    366     '''
--> 367     return self._map_async(func, iterable, mapstar, chunksize).get()

File ~/opt/anaconda3/envs/spaxelsleuth/lib/python3.10/multiprocessing/pool.py:774, in ApplyResult.get(self, timeout)
    772     return self._value
    773 else:
--> 774     raise self._value

FileNotFoundError: [Errno 2] No such file or directory: 'sami/input/ifs/7139/7139_A_cube_blue.fits.gz'

It seems that settings isn't being updated somewhere: the worker processes are looking for the input cubes under the default relative path (sami/input/...) rather than the path set in my user configuration file. This problem doesn't seem to occur on the servers.

hzovaro commented 1 year ago

Using

multiprocessing.set_start_method("fork")

appears to fix this. See here: https://www.reddit.com/r/learnpython/comments/g5372v/multiprocessing_with_fork_on_macos/
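
For context, this makes sense: since Python 3.8 the default multiprocessing start method on macOS is "spawn", under which each worker process re-imports the package and therefore only ever sees the default settings, not the values loaded by load_user_config(). With "fork", workers inherit a copy of the parent's memory, updated settings included. A minimal standalone sketch illustrating the mechanism (the names are invented for illustration; this is not spaxelsleuth code):

import multiprocessing

# Stand-in for a module-level settings dict with a baked-in default.
SETTINGS = {"input_path": "sami/input"}

def report(_):
    # Each worker returns the module-level value it can see.
    return SETTINGS["input_path"]

if __name__ == "__main__":
    # Mimic load_user_config() updating the settings in the parent process.
    SETTINGS["input_path"] = "/some/custom/input"

    multiprocessing.set_start_method("spawn")  # the macOS default since Python 3.8
    with multiprocessing.Pool(2) as pool:
        print(pool.map(report, [0, 1]))
    # Prints ['sami/input', 'sami/input']: the workers re-imported this module,
    # so the guarded update above never ran in them. Under "fork" they would
    # inherit the parent's memory and print the custom path instead.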

hzovaro commented 1 year ago

Fixed by adding a function in __init__.py that calls multiprocessing.set_start_method("fork").
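
The helper itself isn't shown in this thread, but a minimal sketch of what it could look like (the function name is assumed for illustration; the real implementation may differ):

# In spaxelsleuth/__init__.py (sketch only)
import multiprocessing

def configure_multiprocessing():
    # Force "fork" so that worker processes inherit the settings loaded by
    # load_user_config(). force=True avoids the RuntimeError raised if the
    # start method has already been fixed elsewhere in the process.
    multiprocessing.set_start_method("fork", force=True)

Note that "fork" can be unsafe on macOS when the parent process has already started threads, which is why "spawn" became the default there in Python 3.8; this is a workaround rather than a general fix.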