LSSTDESC / gcr-catalogs

A Python module that provides a unified interface to access mock galaxy catalogs and more for the LSST DESC
https://github.com/LSSTDESC/gcr-catalogs
BSD 3-Clause "New" or "Revised" License

Use NERSC DVS mount of Global? #631

Open sidneymau opened 6 months ago

sidneymau commented 6 months ago

Per https://docs.nersc.gov/performance/io/dvs/, it is probably preferable to access catalogs through /dvs_ro rather than /global. I believe this is a simple change to the root directory in the site config:

/dvs_ro/cfs/cdirs/lsst/shared

Note that this will only work for reading data, though I think that restriction should be fine for GCR. Another detail is that DVS does not support file locking, which HDF5 uses by default. This can be avoided by setting the following environment variable:

export HDF5_USE_FILE_LOCKING=FALSE

There may also be an option to turn locking off when creating the HDF5 reader in Python. I would have to do a few tests to figure out the most sensible solution (probably just setting os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE" in one of the readers).
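The environment-variable route mentioned above could look roughly like the sketch below. The helper name disable_hdf5_file_locking is hypothetical and not part of GCR or h5py. The sketch also assumes the variable should be set before h5py is imported, since that is the safest ordering; whether a later assignment is still picked up may depend on the HDF5 version.

```python
import os


def disable_hdf5_file_locking() -> str:
    """Set HDF5_USE_FILE_LOCKING=FALSE for the current process.

    Hypothetical helper (not part of GCR or h5py). Returns the previous
    value of the variable, or an empty string if it was unset, so that a
    caller can restore it later if needed.
    """
    previous = os.environ.get("HDF5_USE_FILE_LOCKING", "")
    os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return previous


# Call this before the first `import h5py` in the reader module, so the
# HDF5 library sees the setting when it starts up (assumed ordering).
previous_value = disable_hdf5_file_locking()
```

Setting the variable in code affects every HDF5 file opened by the process, not only the catalogs under /dvs_ro, which is worth weighing against a per-file option.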

DVS also doesn't support memory mapping. Before submitting a change, I would be interested to know whether this negatively impacts performance for how people are using GCR to read HDF5 files, or if it's not a problem.
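For testing without touching the site config, the path change proposed above can be sketched as a small prefix rewrite. The helper name to_dvs_ro is hypothetical. The sketch assumes that the current root directory is /global/cfs/cdirs/lsst/shared and that /dvs_ro/cfs mirrors the layout of /global/cfs; other paths are returned unchanged.

```python
from pathlib import PurePosixPath

# Assumed prefixes: the read-only DVS mount is taken to mirror /global/cfs.
GLOBAL_CFS = PurePosixPath("/global/cfs")
DVS_RO_CFS = PurePosixPath("/dvs_ro/cfs")


def to_dvs_ro(path: str) -> str:
    """Map a /global/cfs path to the matching /dvs_ro/cfs path.

    Hypothetical helper for illustration. Paths outside /global/cfs are
    returned as-is, because only the CFS mount is considered here.
    """
    p = PurePosixPath(path)
    try:
        relative = p.relative_to(GLOBAL_CFS)
    except ValueError:
        # Not under /global/cfs, so there is nothing to rewrite.
        return str(p)
    return str(DVS_RO_CFS / relative)


root_dir = to_dvs_ro("/global/cfs/cdirs/lsst/shared")
```

Because the mount is read-only, a rewrite like this only makes sense for code paths that read catalogs; anything that writes would need to keep using /global.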

sidneymau commented 6 months ago

(This just occurred to me after making the DVS slide for the NERSC Sprint :sweat_smile:)

sidneymau commented 6 months ago

hdf = h5py.File(filename, locking=False)

works for reading HDF5 files over DVS.