CMB-S4 / serverless-data-portal-cmb-s4

Proof-of-concept for a data portal using static pages.
https://data.cmb-s4.org/
BSD 3-Clause "New" or "Revised" License

Migration to NERSC #28

Closed zonca closed 11 months ago

zonca commented 1 year ago

We now have a CMB-S4 collaboration endpoint on Perlmutter running Globus 5: "NERSC Perlmutter cmbs4 Collab".

However, this works differently than before: instead of sharing only the gsharing folder on the community file system, it directly accesses the cmbs4 home folder.

Opened a ticket asking for help.

zonca commented 1 year ago

The docs seem outdated: https://docs.nersc.gov/services/globus/

rpwagner commented 1 year ago

@zonca I'm at NERSC this week. Let me know if I can help by talking to someone while I'm there.

zonca commented 1 year ago

@rpwagner yes, I think it would be a good opportunity to get in contact with Lisa Gerhardt, if you haven't met her already, since you have a better understanding of the Globus configuration we need for the portal.

zonca commented 1 year ago

@jdborrill: today @rpwagner and I had a chat with Lisa Gerhardt and Nick Tyler. There are some security restrictions that affect the data portal:

  1. Guest collections that allow users to manage permissions can only be created by a user account, not a collaboration account. So I would create the guest collection with my account, and I would be the only one able to change permissions on the folders. If someone else takes over the portal, they will have to point the guest collection at the data portal root and reconfigure the permissions; there is no need to move data. Should I create the guest collection, or do we want to put someone else in charge of permissions? By permissions here we mean which Globus groups can access which folder. Multiple people can manage membership of the Globus groups (which is actually automated now).
  2. UCSD allows files shared via HTTPS to be directly accessible without logging in to Globus. NERSC instead requires users to authenticate with a Globus account (any account) before downloading files. To work around this, we could keep the public data at UCSD and only the protected data at NERSC. Alternatively, Lisa suggested we could ask NERSC to change its policy on this; it would take some months, but it could be changed.

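To make the first point concrete, the permissions under discussion can be thought of as a mapping from portal folders to the Globus groups allowed to read them. The sketch below is only an illustration of that idea: the folder names, group names, and the `groups_for_path` helper are all hypothetical, and it does not call the Globus API.

```python
from pathlib import PurePosixPath

# Hypothetical folder -> Globus group mapping; all names are illustrative only.
FOLDER_GROUPS = {
    "/public": {"everyone"},
    "/dc0": {"cmbs4-members"},
}


def groups_for_path(path: str) -> set[str]:
    """Return the groups allowed to read `path`.

    The rule of the closest parent folder that has an entry wins.
    """
    current = PurePosixPath(path)
    for candidate in (current, *current.parents):
        groups = FOLDER_GROUPS.get(str(candidate))
        if groups is not None:
            return groups
    return set()  # no rule found: nobody has access
```

With this layout, handing the portal over to someone else would mean re-creating a table like `FOLDER_GROUPS` on the new guest collection, without touching the files themselves.
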
jdborrill commented 1 year ago

Hi Andrea,

Thanks for this.

  1. I'm happy for you to be in charge of the guest collection, as you suggest.

  2. Can the same portal support data at multiple sites (i.e., NERSC and SDSC), or would we need two portals?

Julian


zonca commented 1 year ago

One portal will point to the two different sites: public datasets will point to UCSD, private ones to NERSC. Users will hardly notice.
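As a rough sketch of how a single portal could route each dataset to the right site, assuming each dataset carries a public/private flag: the `Dataset` class, the `download_url` helper, and the base URLs below are placeholders, not the portal's actual code or the real Globus HTTPS hostnames.

```python
from dataclasses import dataclass

# Placeholder base URLs: the real Globus HTTPS hostnames are not given here.
SITE_BASE_URL = {
    "UCSD": "https://ucsd-collection.example.org",
    "NERSC": "https://nersc-collection.example.org",
}


@dataclass(frozen=True)
class Dataset:
    name: str
    path: str     # path of the file inside the collection
    public: bool  # True if downloadable without logging in


def download_url(dataset: Dataset) -> str:
    """Build the download link: public data from UCSD, private data from NERSC."""
    site = "UCSD" if dataset.public else "NERSC"
    return f"{SITE_BASE_URL[site]}/{dataset.path.lstrip('/')}"
```

Since the site is chosen only when the link is generated, moving a dataset later would come down to flipping its flag (or changing the mapping) and regenerating the pages.
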

jdborrill commented 1 year ago

OK - sounds like a temporary plan while we try to get NERSC to change its policy. In the long term we want everything in one place so we don't need to move things when they become public. In the short term this would mean that the CMB-S4 stuff would be at NERSC and the PySM PanEx skies at UCSD, correct?

J


zonca commented 1 year ago

That's all correct. I'll get started and share updates here.


zonca commented 1 year ago

Started to work on moving DC-0 LAT in https://github.com/CMB-S4/serverless-data-portal-cmb-s4/pull/29

zonca commented 1 year ago

OK, it worked fine. DC-0 LAT is now served from NERSC: all the links in https://data.cmb-s4.org/dc0-chlat-split01-025.html start with g-9f. Everything else is still served through UCSD, with links starting with g-45.
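A quick way to check which site a page's links are served from is to look at the hostname prefix. This is a minimal sketch, assuming the `g-9f` and `g-45` prefixes appear at the start of the link hostname; the `site_for_link` and `count_sites` helpers are hypothetical, and any full hostname passed to them would have to come from the real pages.

```python
from collections import Counter
from urllib.parse import urlparse

# Hostname prefixes mentioned above; the full hostnames are not reproduced here.
HOST_PREFIX_TO_SITE = {"g-9f": "NERSC", "g-45": "UCSD"}


def site_for_link(url: str) -> str:
    """Guess the serving site from the start of the link's hostname."""
    host = urlparse(url).hostname or ""
    for prefix, site in HOST_PREFIX_TO_SITE.items():
        if host.startswith(prefix):
            return site
    return "unknown"


def count_sites(urls) -> Counter:
    """Count how many links are served from each site."""
    return Counter(site_for_link(url) for url in urls)
```

Running `count_sites` over the links extracted from a dataset page should report only NERSC for a page that has been fully migrated.
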

I'll move the rest of DC-0 once we have the new maps, see #27. Planck and Panexp skies remain at UCSD.

zonca commented 11 months ago

We can now move public data to NERSC; see #31.