CMB-S4 / serverless-data-portal-cmb-s4

Proof-of-concept for a data portal using static pages.
https://data.cmb-s4.org/
BSD 3-Clause "New" or "Revised" License

Public data at NERSC #31

Open zonca opened 8 months ago

zonca commented 8 months ago

As part of #28, I requested anonymous HTTPS access for our public datasets. It is now active, so we can move the public Planck PR4 data from UCSD to NERSC.

The release is on the portal at https://data.cmb-s4.org/planck_pr4.html; in total it is about 300 GB.

My first proposal would be to just duplicate the data under:

/global/cfs/cdirs/cmbs4/gsharing

I think we have those files under the CMB account instead of the S4 account, so using symlinks or hardlinks will probably not work due to permissions. @jdborrill do you agree?

jdborrill commented 8 months ago

For CFS space-management purposes, all public data (currently Planck and PanEx/PySM, but this should be extended to include e.g. ACT DR6) should be in the cmb CFS space; all private CMB-S4 data should be in the cmbs4 CFS space. When CMB-S4 data go public, they should be moved to the cmb space.

We don't have the space to duplicate things, but we can leave a symlink from the cmbs4 to the cmb CFS space when data are moved.

We need to review and clean up whatever is currently in the cmb space; what is left should all be made available through the portal so we don't have to give out accounts just to copy data elsewhere.

Julian

zonca commented 8 months ago

OK, so the only way to achieve this would be to enable another Globus guest collection that points to this folder (which does not exist yet):

 `/global/cfs/cdirs/cmb/gsharing`

and then have the data portal point to the cmbs4 gsharing folder for the S4 data and to the cmb gsharing folder for the public data. It is a bit more complicated but manageable.
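A minimal sketch of how the portal could choose between the two collections when building a download link. The base URLs and the `file_url` helper are hypothetical placeholders for illustration; the real HTTPS endpoints of the Globus guest collections are not given in this thread.

```python
from urllib.parse import quote

# Hypothetical base URLs, one per guest collection. Replace these with the
# actual HTTPS endpoints once both collections exist.
COLLECTIONS = {
    "cmbs4": "https://example.org/cmbs4-gsharing/",  # /global/cfs/cdirs/cmbs4/gsharing
    "cmb": "https://example.org/cmb-gsharing/",      # /global/cfs/cdirs/cmb/gsharing
}


def file_url(relative_path: str, public: bool) -> str:
    """Return the download URL for a file, given its path relative to gsharing.

    Public data are served from the cmb collection, S4 data from cmbs4.
    """
    base = COLLECTIONS["cmb" if public else "cmbs4"]
    # quote() percent-encodes unsafe characters and keeps "/" separators.
    return base + quote(relative_path.lstrip("/"))
```

With this layout, moving a dataset from one space to the other only requires flipping its `public` flag in the portal metadata.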

zonca commented 7 months ago

@jdborrill do you agree with creating a separate Globus collection for the CMB CFS? See the details above.

zonca commented 5 months ago

@jdborrill I think the 2 other panexp v1 skies, i.e. SO and LiteBIRD, could be moved from cmbs4xlitebird and from the Simons Observatory space into the CMB CFS. Then I'll set up another endpoint in the CMB space so that we can also serve them from the S4 data portal. Ok for me to move them and then put symlinks at their previous location?
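A minimal sketch of the move-then-symlink step, assuming the move is done by an account with write access to both spaces. The `move_and_link` helper and the example paths in the comment are hypothetical; the actual source folders are not spelled out in this thread.

```python
import os
import shutil
from pathlib import Path


def move_and_link(src, dest_dir):
    """Move src into dest_dir and leave a symlink at the previous location."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    if src.is_symlink():
        raise ValueError(f"{src} is already a symlink, nothing to move")
    dest = (dest_dir / src.name).resolve()
    if dest.exists():
        raise FileExistsError(f"{dest} already exists, refusing to overwrite")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Across file systems shutil.move copies and then deletes the source.
    shutil.move(str(src), str(dest))
    # Absolute target, so the link keeps working wherever it is read from.
    os.symlink(dest, src, target_is_directory=dest.is_dir())
    return dest


# Hypothetical usage:
# move_and_link("/path/to/old_space/some_sky", "/global/cfs/cdirs/cmb/gsharing")
```

The symlink only resolves for users who have read permission on the target, so the permissions of the destination folder still need checking after the move.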

jdborrill commented 5 months ago

Sounds good to me, yes.

Julian
