LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

Data transfer and HPSS storage of CS catalogs #394

Open heather999 opened 4 years ago

heather999 commented 4 years ago

As discussed this week at the CS meeting and in issue https://github.com/LSSTDESC/desc-help/issues/11, there is a planned data transfer at NERSC from projecta to CFS. This includes the cosmoDC2 catalogs currently stored under /global/projecta/projectdirs/lsst/groups/CS. The CS group has identified which catalogs should remain active on CFS (and also be stored on NERSC HPSS) and which can be copied to HPSS and then removed.

Here is the list of catalogs, and their sizes, that will remain available on the CFS:

The list of catalogs that will be copied to NERSC HPSS and removed:

JoanneBogart commented 4 years ago

I believe these directories comprise the whole of /global/projecta/projectdirs/lsst/groups/CS/cosmoDC2. Are they all catalogs? Of the items in the top list, only about half are registered in GCR. Perhaps it's reasonable to call the others catalogs as well; I'd just like to confirm.

heather999 commented 4 years ago

I think we would need to check with @evevkovacs and @yymao to see if all the catalogs that will remain on disk should be referenced in GCR.

yymao commented 4 years ago

No, not all of the subdirectories that CS asks to keep active on CFS need to be made available in GCRCatalogs. Some of these are intermediate data products that we don't expect regular end users to use.

evevkovacs commented 4 years ago

The other directories are mostly auxiliary data that were used to make the catalogs and need to be kept. We can rethink this if need be, but for now I think it makes sense to keep them in the catalogs directory. They are all relevant to past or ongoing work.

katrinheitmann commented 4 years ago

With the new filesystem, has this all been sorted out? If not, what else needs to be done? @heather999 @yymao @evevkovacs. Maybe it's possible to write a very brief conclusion and close this issue? Thanks!

heather999 commented 4 years ago

We still want to back up all the catalogs, and those that have been identified for removal can be removed from CFS. There has been so much other work going on reorganizing CFS that I haven't gotten back to backing up the catalogs yet. I want to use this issue to keep track of my progress as things are copied to HPSS.

wmwv commented 3 years ago

@heather999 Can this be marked as done?

heather999 commented 3 years ago

Unfortunately, not yet. Hoping to get this completed over the next couple of weeks.