simonsobs / pixell

A rectangular pixel map manipulation and harmonic analysis library derived from Sigurd Naess' enlib.

enmap.write_fits can crash if multiple processes write to the same file at roughly the same time #252

Open zatkins2 opened 8 months ago

zatkins2 commented 8 months ago

Description

enmap.write_fits ends with a call to hdus.writeto(fname, overwrite=True). If multiple processes use enmap.write_fits to write to the same file at roughly the same time, this call can crash, because internally it performs several operations on the file, including removing it from the filesystem. If one process removes the file while another is about to operate on it, the second process raises a FileNotFoundError. This is an edge case, but I've encountered it a few times in embarrassingly parallel scripts in which each process writes, e.g., the same mask to disk.

I don't think it's really pixell's responsibility to handle this particular bug; it's more an issue with astropy. But I'm documenting it here in case others encounter it, and to propose one change to pixell: expose the overwrite kwarg all the way up through enmap.write_map so that users can set it to False.
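
To make the proposal concrete, here is a minimal sketch of what forwarding the keyword could look like. It is not pixell's actual code: write_map_sketch and write_fits_sketch are hypothetical stand-ins that write raw bytes instead of a FITS file, and the existence check is modeled on the astropy lines visible in the traceback. The OSError raised when overwrite is False reflects my understanding of astropy's behavior, so treat that part as an assumption.

```python
import os


def write_fits_sketch(fname, payload, overwrite=True):
    """Hypothetical stand-in for enmap.write_fits (writes raw bytes, not FITS)."""
    # Same style of check as astropy's _overwrite_existing in the traceback.
    if os.path.exists(fname) and os.path.getsize(fname) != 0:
        if not overwrite:
            # Assumed behavior: refuse to touch an existing file.
            raise OSError(f"File {fname!r} already exists.")
        os.remove(fname)
    with open(fname, "wb") as f:
        f.write(payload)


def write_map_sketch(fname, payload, overwrite=True):
    """Hypothetical stand-in for enmap.write_map; it only forwards overwrite."""
    write_fits_sketch(fname, payload, overwrite=overwrite)
```

With overwrite=False, a writer that finds the file already present gets an explicit error and never removes the file, so the removal step that the other processes trip over is not reached.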

Example

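import multiprocessing as mp

import numpy as np
from pixell import enmap

# Small 5x5 test map
e = enmap.enmap(np.arange(25).reshape(5, 5))

def f(*args):
    # Every worker writes to the same path, which triggers the race
    enmap.write_map('/home/zatkins/e.fits', e)

with mp.Pool(5) as p:
    p.map(f, [1, 2, 3])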

---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "<ipython-input-14-07b695cb02af>", line 2, in f
    enmap.write_map('/home/zatkins/e.fits', e)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/pixell/enmap.py", line 2292, in write_map
    write_fits(fname, emap, extra=extra, allow_modify=allow_modify)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/pixell/enmap.py", line 2390, in write_fits
    hdus.writeto(fname, overwrite=True)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/astropy/io/fits/hdu/hdulist.py", line 1010, in writeto
    fileobj = _File(fileobj, mode=mode, overwrite=overwrite)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/astropy/io/fits/file.py", line 217, in __init__
    self._open_filename(fileobj, mode, overwrite)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/astropy/io/fits/file.py", line 615, in _open_filename
    self._overwrite_existing(overwrite, None, True)
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/site-packages/astropy/io/fits/file.py", line 488, in _overwrite_existing
    os.path.exists(self.name) and os.path.getsize(self.name) != 0
  File "/home/zatkins/.conda/envs/pspy-della8/lib/python3.10/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/zatkins/e.fits'
"""
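
In the meantime, one possible user-side workaround is to have each process write to its own temporary file and then rename it onto the target. The sketch below assumes a POSIX filesystem, where os.replace within one directory is atomic. write_atomically is a hypothetical helper, not part of pixell or astropy, and write_func is any callable that takes a path and writes the file.

```python
import os
import tempfile


def write_atomically(fname, write_func):
    """Write via a unique temp file, then atomically rename it to fname.

    write_func is a user-supplied callable taking the path to write to.
    """
    dirname = os.path.dirname(os.path.abspath(fname))
    # Keep the extension in case the writer infers the format from it.
    suffix = os.path.splitext(fname)[1]
    # The temp file lives in the target directory so the rename stays
    # on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=suffix)
    os.close(fd)
    try:
        write_func(tmp)
        os.replace(tmp, fname)  # replaces any existing fname in one step
    except BaseException:
        # Do not leave a stray temp file behind if the write failed.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

In the example above, f would then call write_atomically('/home/zatkins/e.fits', lambda tmp: enmap.write_map(tmp, e)). No two processes ever share a temporary path, so none of them can remove a file that another is about to stat; the last rename to finish decides the final contents.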

msyriac commented 4 months ago

Don't we want such attempts to fail with an error? It's the user's responsibility to make sure multiple processes don't write to the same file, no? Maybe I'm missing the point of your recommendation. What exactly is achieved if the user can set overwrite to False?