leap-stc / data-management

Collection of code to manually populate the persistent cloud bucket with data
https://catalog.leap.columbia.edu/
Apache License 2.0

How do I check a catalog update upstream #133

Open jbusecke opened 5 months ago

jbusecke commented 5 months ago

Consider the following case: I just updated some info in a feedstock repo. Is there a way to preview the resulting changes other than building the catalog on main by manually running this action?

@andersy005

andersy005 commented 5 months ago

@jbusecke, i have been meaning to add a repository_dispatch trigger to the GH action. this should make it easy to rebuild the catalog whenever there's a change in a feedstock repository. i'm going to open a PR shortly

andersy005 commented 5 months ago

it appears the repository_dispatch mechanism is a little involved when it comes to multiple repositories that may or may not reside under the same GitHub organization. The dispatching mechanism requires adding the following step to the validate-catalog GH workflow:

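      - name: Dispatch event
        uses: actions/github-script@v4
        with:
          # Personal access token; the default GITHUB_TOKEN cannot dispatch to another repository
          github-token: ${{ secrets.GH_USER_TOKEN }}
          script: |
            // Send a repository_dispatch event to the catalog repository
            await github.repos.createDispatchEvent({
              owner: 'leap-stc',
              repo: 'data-management',
              event_type: 'update-catalog',
              client_payload: {
                // Expanded by Actions before the script runs
                origin_repo: '${{ github.repository }}'
              }
            })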

as you can see, this step requires a personal access token (secrets.GH_USER_TOKEN) with specific permissions for the dispatch to work. This requirement creates a token management headache, since one has to set this secret at either the repository level or the organization level, depending on the admin privileges they have. @jbusecke, i'm curious to hear your thoughts on this. in the meantime, the simplest option is to set the cron schedule to the minimum allowed interval of 5 minutes.
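for reference, here is a rough sketch of what the receiving side in leap-stc/data-management could look like, covering the dispatch trigger, a manual trigger, and the 5-minute cron fallback. The workflow name, job name, and step are assumptions for illustration and not the actual workflow in the repository; only the `update-catalog` event type and the `origin_repo` payload key come from the step above.

```yaml
# Hypothetical trigger section for the catalog-building workflow;
# the workflow name and job layout are assumptions.
name: update-catalog (sketch)

on:
  # Fired by the "Dispatch event" step in a feedstock repository.
  repository_dispatch:
    types: [update-catalog]
  # Allows a manual rebuild from the Actions tab.
  workflow_dispatch:
  # Fallback polling; GitHub's shortest supported interval is every 5 minutes.
  schedule:
    - cron: '*/5 * * * *'

jobs:
  build-catalog:
    runs-on: ubuntu-latest
    steps:
      - name: Report which feedstock requested the rebuild
        # client_payload is only populated for repository_dispatch events.
        if: github.event_name == 'repository_dispatch'
        run: echo "Triggered by $ORIGIN_REPO"
        env:
          ORIGIN_REPO: ${{ github.event.client_payload.origin_repo }}
```

passing the payload through an environment variable rather than interpolating it directly into the `run` script keeps untrusted input out of the shell command.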

jbusecke commented 4 months ago

Thanks for the explanation @andersy005.

My intuition is that the added complexity is not worth it?

But maybe we can add instructions that indicate that a manual rebuild or wait is needed to see updates? Happy for this to be closed via a small docs PR.

I would not set the cron trigger frequency this high, as this seems like a once-off scenario in many cases?