Closed — andreufont closed this issue 3 years ago.
Hi Andreu, I have never encountered something like that before. As you said, I suspect something might be wrong with the local cluster.
I've never encountered that exact error, but similar mysterious errors often result when the cluster's python or h5py was recompiled without updating the underlying HDF5 library, or vice versa, when HDF5 was recompiled with a new compiler and h5py was not rebuilt against it. Definitely something to bring up with your cluster admin.
Hi @sbird, I will try my luck with the help desk.
For the record, I was able to isolate the problem even further. It only happens when I use the Sherwood simulations, since the other snapshots I have are in BigFile format. It happens whenever I load the snapshot from Spectra, even if it is only to access the header, as in:
from fake_spectra import griddedspectra as gs
sim_dir='/data/desi/common/HydroData/Sherwood/planck1_80_1024/'
snap_num=9
pixel_res=10.0
spec=gs.GriddedSpectra(snap_num,sim_dir,res=pixel_res,reload_file=True)
It looks like the code reads the snapshot data properly (redshift, box size...), but it crashes when deleting the snapshot class. The code reports an ignored exception in AbstractSnapshot.__del__(), pointing again to a problem with closing the HDF5 file.
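For context, the "ignored exception" message is standard Python behavior, not specific to fake_spectra: an exception raised inside __del__ is printed to stderr but never propagates to the caller. A minimal illustration (the class here is a toy stand-in, not fake_spectra code):

```python
# Illustrative only: reproduce Python's "Exception ignored in __del__"
# behavior. An exception raised while an object is being destroyed is
# reported on stderr but does not propagate, so the program keeps running.
class Closer:
    def __del__(self):
        # Stands in for a failing h5py File.close() call.
        raise RuntimeError("failed to close HDF5 file")

c = Closer()
del c  # stderr shows "Exception ignored in: ..."; nothing is raised here
print("still running")
```

This is why the skewers are still written correctly to disk: the failure happens during object teardown, after the real work is done.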
Even more strange, if I directly construct a snapshot object outside of Spectra, it works just fine:
from fake_spectra import abstractsnapshot as absn
sim_dir='/data/desi/common/HydroData/Sherwood/planck1_80_1024/'
snap_num=9
#snap = absn.AbstractSnapshotFactory(snap_num, sim_dir)
snap = absn.HDF5Snapshot(snap_num, sim_dir)
# read number of particles from snapshot
npart = snap.get_npart()
print('npart',npart)
# read box size from header
box = snap.get_header_attr("BoxSize")
print('box',box)
Before contacting the help desk, I copied one of the snapshots from the Sherwood simulations to my laptop, and it also crashes there. I also tried running the code on Hypatia, and it crashes there too. I hadn't noticed this before because I only use HDF5 with the Sherwood sims.
I can live with this issue, since the skewers are actually written to disk; the error only appears when exiting, or when making a deepcopy of a spectra class, and for the latter I can write a work-around.
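One possible shape for such a work-around (hedged sketch: a plain file stands in for the h5py handle, and the class and attribute names are hypothetical, not fake_spectra's) is to give the class a __deepcopy__ method that copies the plain attributes and reopens the file, instead of trying to duplicate the open handle:

```python
import copy

class Spectra:
    """Toy stand-in for a spectra class holding an open file handle
    (illustrative names; not fake_spectra's actual attributes)."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "rb")  # open handles cannot be deep-copied

    def __deepcopy__(self, memo):
        # Work-around sketch: copy the plain attributes and reopen the
        # file in the copy, rather than copying the live handle.
        new = Spectra.__new__(Spectra)
        new.path = copy.deepcopy(self.path, memo)
        new._fh = open(new.path, "rb")
        return new
```

Without the __deepcopy__ hook, copy.deepcopy falls back to pickling the object and fails on the open file handle, which matches the behavior described above.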
All that function does is close the file! I don't know what we can do to fix it. I think h5py possibly becomes unhappy when the file is closed while some part of the I/O is still live (this might be an h5py bug). You might just not be able to make a deepcopy of the h5py-containing object. If it worries you, you could also just delete the close() call: h5py will try to close the file when it goes out of scope via the normal Python garbage collector anyway.
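A middle ground between deleting the close() call and leaving it as-is would be to make the destructor tolerate a failing close. This is a hedged sketch only (a plain file stands in for the h5py object, and the class name is hypothetical):

```python
class SnapshotHandle:
    """Illustrative wrapper around a file handle; a plain file stands in
    for an h5py object, and these names are not fake_spectra's."""

    def __init__(self, path):
        self._f = open(path, "rb")

    def __del__(self):
        try:
            self._f.close()
        except Exception:
            # The interpreter (or the library's own finalizers) will
            # reclaim the handle during garbage collection anyway, so
            # swallow the error rather than letting it surface as
            # "Exception ignored in __del__" at exit.
            pass
```

This keeps the explicit close in the common case while silencing the noisy teardown failure.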
It doesn't bother me anymore, since I wrote a work-around to avoid using deepcopy.
Happy to leave it as it is, but I thought I'd add more documentation in case someone else finds the same issue in the future.
I'm assuming most of the time you are using BigFile nowadays, right?
Hello,
I have used fake_spectra for Illustris/TNG snapshots, which are in HDF5, but never tried to deepcopy the spectra class. I have just generated the spectra and written them to file. In those cases, it worked properly.
It is very likely also h5py version dependent. If you want to write a doc patch I would be happy to take it.
The code runs fine on my laptop (macOS), but when I run it on the cluster in Barcelona I get an error when trying to close the HDF5 file. This is what I get when I run a test script (see below) that reloads a few spectra from the snapshot:
Here is the script:
Note that it also crashes similarly when loading from savefile.
It is a bit annoying because the HDF5 file stays open, and one cannot make a deepcopy of the spectra class, since open HDF5 file objects cannot be copied.
I had h5py version 2.10, but the issue remains even after updating to 3.1.
Has anyone found something like this? It might be something fishy with my local cluster... In which case I'll contact the help desk.