a-paranjape / sahyadri-sandbox

Sandbox for testing codes and scripts related to Sahyadri simulations at IUCAA/TIFR/IISER-Pune/NCRA

Halo catalogue conversion to fits and parse tree for minimal information. #3

Closed: shadaba closed this issue 2 months ago

shadaba commented 6 months ago

We would like to store the halo catalogue with minimal information and use fits.gz to reduce storage. One of the main tasks is to understand the merger tree output and make sure we keep all the information with the minimum storage requirement. We should also keep in mind that the data should be stored so that retrieving progenitor information is efficient, for example by using row indices.
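To make the row-index idea concrete, here is a minimal sketch in plain Python. It assumes each halo carries the id of its main progenitor in an earlier snapshot; the names `ids_to_rows`, `main_prog_id` and the toy values are hypothetical and are not taken from the merger tree output or the repository.

```python
def ids_to_rows(catalogue_ids, wanted_ids, missing=-1):
    """Return, for each entry of wanted_ids, its row in catalogue_ids."""
    # Build the id -> row lookup once, so each query is O(1).
    row_of = {halo_id: row for row, halo_id in enumerate(catalogue_ids)}
    return [row_of.get(halo_id, missing) for halo_id in wanted_ids]


# Toy data (hypothetical): an earlier snapshot and, for each halo in the
# current snapshot, the id of its main progenitor (-1 means no progenitor).
prev_ids = [101, 205, 309, 412]
prev_mvir = [1.0e12, 3.5e11, 8.0e12, 2.2e11]
main_prog_id = [309, -1, 101]

# Convert ids to row indices once, at conversion time, and store this column.
prog_row = ids_to_rows(prev_ids, main_prog_id)

# Later, a progenitor property is a direct row lookup with no id search.
prog_mvir = [prev_mvir[r] if r >= 0 else None for r in prog_row]
```

Storing `prog_row` instead of the id means the search cost is paid once during conversion, and a small integer column may also compress well.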

a-paranjape commented 2 months ago

@shadaba can you pls list the columns saved in the _basic and _extended fits files in the data_compression branch?

Also, will the prep_halos method always only load the _basic file? What about loading the _extended file?

shadaba commented 2 months ago

The list of quantities in the basic file is defined here: https://github.com/a-paranjape/sahyadri-sandbox/blob/data_compression/scripts/post-process/ConvertToFITS.py#L49C5-L50C55

which is: basic_cols=['x','y','z','vx','vy','vz','Mvir','Mvir_all','M200b','M200c','M500c','M2500c', 'rvir','rs','vrms','id','pid','T/|U|']

All other quantities are kept in the extended file. The basic columns are the quantities typically used in most cases, but the list can be changed as we see fit.
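As a rough illustration of the split, the sketch below partitions a catalogue's column names into the two groups. The `basic_cols` list is the one quoted above; the function name `split_columns` and the example column names `Spin` and `b_to_a` are assumptions for illustration only, not the code in ConvertToFITS.py.

```python
# Basic columns as quoted in this thread.
BASIC_COLS = ['x', 'y', 'z', 'vx', 'vy', 'vz', 'Mvir', 'Mvir_all', 'M200b',
              'M200c', 'M500c', 'M2500c', 'rvir', 'rs', 'vrms', 'id', 'pid',
              'T/|U|']


def split_columns(all_cols, basic_cols=BASIC_COLS):
    """Partition column names into (basic, extended) lists."""
    available = set(all_cols)
    # Basic columns keep the order of basic_cols; skip any that are absent.
    basic = [c for c in basic_cols if c in available]
    # Everything else goes to the extended file, in catalogue order.
    wanted = set(basic_cols)
    extended = [c for c in all_cols if c not in wanted]
    return basic, extended


# Hypothetical catalogue header with two non-basic columns.
basic, extended = split_columns(['id', 'pid', 'x', 'y', 'z', 'Mvir',
                                 'Spin', 'b_to_a'])
```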

The current prep_halo fits function only loads the basic file. If needed, we can add a key similar to va to load the extended file in this function. This was not needed for the current tests, so it hasn't been added. But given the large number of quantities, we might want to provide a function where the user supplies a list of quantities to load, and all relevant files (basic, extended, vahc) are used to load the needed properties.
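A column-driven loader of this kind could first work out which files it needs to open. The following is only a sketch of that step: the function name `files_needed`, the mapping layout, and the example columns `Spin` and `lam1` are hypothetical, and the actual contents of the extended and vahc files are not listed in this thread.

```python
def files_needed(requested, columns_by_file):
    """Group requested columns by the file kind that stores them."""
    needed = {}
    for col in requested:
        for kind, cols in columns_by_file.items():
            if col in cols:
                needed.setdefault(kind, []).append(col)
                break
        else:
            # No file kind contains this column.
            raise KeyError(f"column {col!r} not found in any file")
    return needed


# Hypothetical layout: which columns live in which file kind.
layout = {
    'basic': ['x', 'y', 'z', 'Mvir', 'id', 'pid'],
    'extended': ['Spin'],
    'vahc': ['lam1'],
}

plan = files_needed(['Mvir', 'Spin', 'id'], layout)
```

Only the file kinds that appear as keys in `plan` would then be opened, so a request for basic columns alone never touches the extended or vahc files.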

Let me know if you would like me to already implement an extended option similar to va, or load the extended properties by default.

a-paranjape commented 2 months ago

thanks. i think a new keyword, say `ext`, with default value False would be good to have. then the complete information can be accessed when needed.

later we can update as you suggest so that the user specifies which columns are needed, but for now i think this is not necessary.

shadaba commented 2 months ago

A keyword ext=False has now been added; if True, it loads all the extended properties. The notebook SahyadriCompression.ipynb has also been updated to reflect these changes for testing.
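The behaviour of the keyword can be sketched as follows. This is not the repository's prep_halos implementation: the function `load_halos`, the injected `read_table` callable, the file stem and the exact file name pattern are assumptions made so the example runs without any FITS library.

```python
def load_halos(stem, read_table, ext=False):
    """Load the basic columns, plus the extended ones when ext is True.

    read_table is any callable mapping a file name to a dict of columns.
    """
    halos = dict(read_table(stem + "_basic.fits.gz"))
    if ext:
        # Extended columns are merged in only on request.
        halos.update(read_table(stem + "_extended.fits.gz"))
    return halos


# Stand-in for files on disk (hypothetical names and values).
fake_disk = {
    "out_1_basic.fits.gz": {"id": [1, 2], "Mvir": [1e12, 2e12]},
    "out_1_extended.fits.gz": {"Spin": [0.03, 0.05]},
}

halos_default = load_halos("out_1", fake_disk.__getitem__)
halos_full = load_halos("out_1", fake_disk.__getitem__, ext=True)
```

With the default `ext=False` the extended file is never read, which keeps the common case cheap.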

a-paranjape commented 2 months ago

merged branch data_compression with master