Open EiffL opened 2 years ago
In order to use the sfh/datasets/tng100 dataset:
from sfh.datasets.tng100 import tng100
tng100 implements interpolation for the sfh dataset, so there is no need to implement the simpler sfh dataset. However, tng100 does not have the "Mask" tensor (which is in sfh_interp), which is 1 where the value is real and 0 where it is interpolated. I'm not sure "Mask" is really necessary, though.
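For reference, the "Mask" convention described above (1 for a real value, 0 for an interpolated one) can be sketched as follows. This is only an illustration in plain Python, assuming missing samples are encoded as NaN and filled linearly; `interpolate_with_mask` is a hypothetical helper, not the code used in sfh_interp or tng100.

```python
import math


def interpolate_with_mask(values):
    """Fill NaN gaps linearly and return (filled, mask).

    mask[i] is 1 where values[i] was a real sample and 0 where it was
    interpolated. Gaps at either end are filled with the nearest real
    value. NaN-as-missing is an assumption made for this sketch.
    """
    mask = [0 if math.isnan(v) else 1 for v in values]
    known = [i for i, m in enumerate(mask) if m == 1]
    if not known:
        raise ValueError("need at least one real value")

    filled = list(values)
    for i, m in enumerate(mask):
        if m == 1:
            continue
        # Nearest real samples on each side of the gap, if any.
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled, mask


# Toy star formation history with three missing snapshots.
sfr = [1.0, float("nan"), 3.0, float("nan"), float("nan"), 6.0]
filled, mask = interpolate_with_mask(sfr)
```

If downstream code never needs to tell real samples from interpolated ones, the mask can indeed be dropped.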
OK, then we can most likely proceed by just removing the datasets that are not necessary and keeping only sfh/datasets/tng100. @yannick1974 if you want to open a PR for this cleaning, feel free to do so :-)
I propose this to clean things up:
datasets directory: the TensorFlow datasets and the code generating them. It will be a Python module to import the datasets from.
code directory: I'm not sure here, but I tend to prefer having large chunks of code separated from the notebooks and imported into them. Maybe it's not useful here, feel free to cry out.
notebooks directory: all the notebooks.
Also, I'd like to use environment variables for the location of data files, using the IDRIS variables when appropriate. This will make it easier to run the notebooks outside of Jean-Zay.
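A minimal sketch of the environment variable lookup, to make the idea concrete. The variable name `SFH_DATA_DIR` and the `sfh_data` subfolder are hypothetical, and using `SCRATCH` and `WORK` as the IDRIS variables is an assumption about which ones would be appropriate here.

```python
import os
from pathlib import Path


def get_data_dir(env=None):
    """Return the directory where the data files are expected.

    Lookup order (all names are assumptions for this sketch):
      1. SFH_DATA_DIR, an explicit override that works on any machine;
      2. SCRATCH, then WORK, assumed to be set on Jean-Zay;
      3. a folder in the user's home directory as a last resort.
    """
    env = os.environ if env is None else env

    override = env.get("SFH_DATA_DIR")
    if override:
        return Path(override)

    for var in ("SCRATCH", "WORK"):
        value = env.get(var)
        if value:
            return Path(value) / "sfh_data"

    return Path.home() / "sfh_data"


# In a notebook:
data_dir = get_data_dir()
```

Outside of Jean-Zay, setting `SFH_DATA_DIR` before starting the notebook would be enough to point it at a local copy of the data.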
What do you all think about that?
Yep that sounds good!
The only thing I usually do is that the dataset directory is a submodule of the code directory, which is itself a pip installable module.
I can do it like that too.
Main dataset: TNG100
[x] sfh/datasets/mergers -> Provided kinetic maps + noiseless and observed images combined with time to last merger | obsoleted by sfh/datasets/tng100 (@ppfn)
[ ] sfh/datasets/sfh -> Unconditional time-domain data | obsoleted by sfh/datasets/tng100 | Check that the processing is the same (@16Aghnar @laurilaatu)
[ ] sfh/datasets/sfh_interp -> Unconditional time-domain data (with a different strategy for missing data) | to be checked
[ ] sfh/datasets/sfhsed/sfhsed -> Contains SED + quantile estimation of history | obsoleted by sfh/datasets/tng100 (ask @aghribi to check)
The proposal is to check that all the data we care about is in TNG100, and move the other datasets into a legacy folder.
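The move into a legacy folder could be scripted roughly as below. This is a sketch under assumptions: the folder is called `legacy`, only `tng100` is kept, and `move_to_legacy` is a hypothetical helper. In the actual repository `git mv` would be preferable so that history is preserved.

```python
import shutil
from pathlib import Path

KEEP = {"tng100"}  # datasets that stay in place (assumption)


def move_to_legacy(datasets_dir, keep=KEEP, dry_run=True):
    """Move every dataset folder not in `keep` into datasets_dir/legacy.

    Returns the names of the folders that were (or would be) moved.
    With dry_run=True nothing is touched on disk.
    """
    datasets_dir = Path(datasets_dir)
    legacy = datasets_dir / "legacy"
    moved = []
    for entry in sorted(datasets_dir.iterdir()):
        # Skip files, kept datasets, the legacy folder itself, and
        # private or hidden folders such as __pycache__.
        if (
            not entry.is_dir()
            or entry.name in keep
            or entry.name == "legacy"
            or entry.name.startswith(("_", "."))
        ):
            continue
        if not dry_run:
            legacy.mkdir(exist_ok=True)
            shutil.move(str(entry), str(legacy / entry.name))
        moved.append(entry.name)
    return moved
```

Running it first with `dry_run=True` on `sfh/datasets` lists what would be moved, which is a convenient point to confirm that everything we care about is already covered by TNG100.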