keflavich opened this issue 1 year ago
Do we need to add a kwarg to set the temp directory to write to? Or write to the current directory like CASA?
I think this is a documentation need first. Writing to the current directory is not a better default option - it depends on the machine and the architecture of the storage system. But we should try to prevent writing temp files larger than the tmp drive can hold - frankly, I think dask should be doing this, but we will lose users if we don't come up with a solution.
Agreed. Looks like it can be set in a `.dask/config.yml` file, or from the command line (https://docs.dask.org/en/latest/configuration.html#yaml-files; https://stackoverflow.com/questions/40042748/how-to-specify-the-directory-that-dask-uses-for-temporary-files).
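As a minimal sketch, assuming dask's documented `temporary_directory` configuration key, the scratch location can also be set from within a script (the `/scratch/dask-tmp` path is a placeholder):

```python
import dask

# Point dask's spill/scratch files at a drive with enough free space
# instead of the default system /tmp.
dask.config.set({'temporary_directory': '/scratch/dask-tmp'})
```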
@ashleythomasbarnes Could you fill in more details about what you're trying? I think we can come to a solution but we need tracebacks and/or details about what went wrong.
I'm trying to create a mean spectrum of a large MUSE datacube (~60 GB), but this was filling up `/tmp` on the computer I was using. The code I am using with `spectral_cube.__version__ = '0.6.2'` is given below. I can check whether this is solved using the solutions from @e-koch.
```python
from astropy.io import fits
from spectral_cube import SpectralCube

infile = '../data/ngc0628c/muse/NGC0628-0.92asec.fits'
hdu = fits.open(infile)[1]
cube = SpectralCube.read(hdu)
cube.allow_huge_operations = True

# Mean over the two spatial axes -> 1D spectrum
spec_mean = cube.mean(axis=(1, 2))
```
@ashleythomasbarnes Thanks, that's helpful. Could you confirm that the cube is being read as a `DaskSpectralCube`?
There are a few workarounds for this. Some have to do with dask, as noted above, but another approach is to force a non-dask spectral cube and do `spec_mean = cube.mean(axis=(1,2), strategy='slice')`, which will do a channel-by-channel mean and therefore only load a small fraction of the cube into memory at any given time.
Or set the temporary directory to a location that has sufficient storage:

```
TMPDIR='mydir' python cube_script.py
```
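Alternatively, a sketch of doing the same from inside the script (the path is hypothetical; note that `TMPDIR` must be set before Python's `tempfile` module resolves its default directory, which it caches on first use):

```python
import os

os.environ['TMPDIR'] = '/scratch/tmp'  # hypothetical large-scratch path

import tempfile
print(tempfile.gettempdir())  # should now resolve to /scratch/tmp
```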
I don't think so @adamginsburg... I'm not explicitly using `use_dask=True` when loading, and this is the cube I'm using.
```
SpectralCube with shape=(3761, 1426, 1412):
n_x: 1412 type_x: RA---TAN unit_x: deg range: 24.133237 deg: 24.214712 deg
n_y: 1426 type_y: DEC--TAN unit_y: deg range: 15.741643 deg: 15.820816 deg
n_s: 3761 type_s: AWAV unit_s: Angstrom range: 4700.000 Angstrom: 9400.000 Angstrom
```
OK, then there's a different answer here. Try my suggestion, `spec_mean = cube.mean(axis=(1,2), how='slice')` (note: the keyword is `how`, not `strategy`).
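Putting the correction together, a sketch of the full call on a non-dask cube (filename taken from the snippet above):

```python
from spectral_cube import SpectralCube

cube = SpectralCube.read('../data/ngc0628c/muse/NGC0628-0.92asec.fits')
cube.allow_huge_operations = True

# how='slice' computes the mean channel by channel, so only one
# 2D plane of the cube is held in memory at a time.
spec_mean = cube.mean(axis=(1, 2), how='slice')
```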
The other thing you can do is pass the `memmap_dir` keyword or specify `TMPDIR` globally to force it to write somewhere else.
@e-koch real issue here, though: how did Ash hit a case where tempfiles were being used? Tempfiles are only created by the `parallel` versions of the code, which `.mean` doesn't access, afaict.
https://github.com/radio-astro-tools/spectral-cube/blob/master/spectral_cube/spectral_cube.py#L2922-L2924
Is it the memory mapping in `astropy.io.fits`?
No, that's not relevant - `fits`'s memory mapping just loads the file on disk; it won't create any new files in temp directories, at least afaik.
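For illustration, `astropy.io.fits` memory-maps the existing on-disk file (`memmap=True` is the default), so no scratch files are created:

```python
from astropy.io import fits

# Maps the file already on disk into memory lazily; nothing new is
# written to /tmp or any other temp directory.
hdu = fits.open('../data/ngc0628c/muse/NGC0628-0.92asec.fits', memmap=True)[1]
```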
@d-l-walker @ashleythomasbarnes please help fill in details!
The brief version is: running some code in this file: https://github.com/ACES-CMZ/reduction_ACES/blob/main/aces/joint_deconvolution/reproject_mosaic_funcs.py resulted in failures because the `/tmp` drive got filled up. This is almost certainly a side effect of `dask` caching files to `/tmp` directories. We need to add documentation about this problem, and/or do a filesystem size check before dumping things to `/tmp`.
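A sketch of what such a size check could look like (the helper and its error message are hypothetical, not existing spectral-cube API):

```python
import shutil
import tempfile

def check_tmp_space(required_bytes, tmpdir=None):
    """Hypothetical guard: fail early if the temp drive is too small."""
    tmpdir = tmpdir or tempfile.gettempdir()
    free = shutil.disk_usage(tmpdir).free
    if free < required_bytes:
        raise OSError(
            f"{tmpdir} has {free / 1e9:.1f} GB free but {required_bytes / 1e9:.1f} GB "
            "are needed; set TMPDIR or dask's temporary_directory to a larger drive."
        )

# e.g. before reducing a ~60 GB cube:
check_tmp_space(60e9)
```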