desihub / desidatamodel

The DESI data model.
BSD 3-Clause "New" or "Revised" License

installed desidatamodel drops data/column_descriptions.csv #197

Open sbailey opened 1 week ago

sbailey commented 1 week ago

desidatamodel/py/desidatamodel/data/column_descriptions.csv is useful with desiutil/bin/annotate_fits to update the headers of pre-existing files. This works from git clones, but installed versions of desidatamodel drop this file. Update setup.cfg to also install this data file so that we can use it out of tagged versions, either directly or via importlib.resources.
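
A minimal sketch of the importlib.resources route mentioned above, assuming Python 3.9 or later and that the CSV ships inside the installed package at data/column_descriptions.csv. The helper name load_packaged_csv is hypothetical and is not part of desidatamodel or desiutil. The sketch makes no assumption about the column names in the file; it returns whatever the header row defines.

```python
import csv
import io
from importlib.resources import files


def load_packaged_csv(package, relative_path):
    """Read a CSV bundled inside an installed package.

    Hypothetical helper for illustration only. `package` is an
    importable package name and `relative_path` is a '/'-separated
    path inside that package.
    """
    resource = files(package)
    # Walk down one path component at a time so the lookup works for
    # any resource container, not only packages on a plain filesystem.
    for part in relative_path.split("/"):
        resource = resource / part
    text = resource.read_text(encoding="utf-8")
    # Each row becomes a dict keyed by the CSV header row.
    return list(csv.DictReader(io.StringIO(text)))
```

With an installation that includes the data file, a call such as `load_packaged_csv("desidatamodel", "data/column_descriptions.csv")` would presumably return the rows without needing a git clone. With an installation that drops the file, the read fails with FileNotFoundError, which is the symptom reported here.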

weaverba137 commented 1 week ago

Sorry, which installed version has this issue?

weaverba137 commented 1 week ago

I can see where the problem is, but we just use desidatamodel/main at NERSC. What is the specific use case for creating a tag today?

sbailey commented 1 week ago

I needed to add units and column descriptions after the fact to kibo/zcatalog/v1/*.fits files using annotate_fits and the latest column_descriptions.csv file from PR #196. My first instinct was that all production files should only be generated using tagged code/files, so I tagged desidatamodel 24.9 and installed it, expecting to be able to use the column_descriptions.csv file from that installed tag when running annotate_fits. That's when I realized that the file is dropped from installations. I used the version in main instead, which is currently identical, and we can document that the update was equivalent to using the file in the 24.9 tag.

So I functionally have what I needed for today, but in the future it would be nice for desidatamodel installations to also include that data file and not require its users to run out of a git clone. For example, desi_zcatalog itself might eventually grab that file as part of augmenting units and column descriptions, and ideally it would use a tagged desidatamodel when it does.

weaverba137 commented 1 week ago

OK, thanks.