chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

Update GENCODE and Ontology Bump Curator Report GHAs to no longer require a manual API Key rotation upon Key expiration #727

Open nayib-jose-gloria opened 8 months ago

nayib-jose-gloria commented 8 months ago

Background: After an update to the GENCODE and/or Ontology reference files, the next step in the migration workflow is to run a GHA that produces "curator reports". These reports survey our production corpus and flag any datasets that contain ontology and/or GENCODE terms deprecated in the updated reference files. Lattice or our own automation then uses this information to populate the cellxgene-schema migrate command with dataset schema migration steps that replace the deprecated terms.

Task: As implemented, this action uses a Super Curator machine account to survey private datasets for deprecated terms. This requires a super curator API Key, which is stored in AWS SecretsManager and fetched by the GHA. The API Key, however, expires roughly every 2 months, which adds a manual step to the process: we need to regenerate the API Key and update it in AWS SecretsManager.

Come up with a plan to implement automated key rotation OR provision a permanent API Key for this machine account.

joyceyan commented 5 months ago

I looked into this a bit yesterday, but I'm unassigning myself because I'm not 100% sure I'll have this ready before I go on leave next week and I don't want to leave the ticket in a half-completed state.

nayib-jose-gloria commented 4 months ago

After discussion with @Bento007, we determined that adding a descriptive error message with clear steps for rotating the key is sufficient, given how infrequently the key will expire.
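
A minimal sketch of what such an error message could look like, in Python. Everything here is illustrative rather than taken from the repository: `ApiKeyExpiredError`, `ROTATION_STEPS`, and `check_auth_response` are hypothetical names, treating HTTP 401/403 as the expiry signal is an assumption, and the third rotation step (re-running the action) is a guess at the natural follow-up to the two manual steps described in the task.

```python
class ApiKeyExpiredError(RuntimeError):
    """Hypothetical error raised when the curator report job's API key is rejected."""


# Assumed rotation steps; the first two paraphrase the manual process
# described in this issue, the third is an assumption.
ROTATION_STEPS = (
    "1. Regenerate the API Key for the super curator machine account.",
    "2. Update the stored key in AWS SecretsManager.",
    "3. Re-run the curator report GHA.",
)


def check_auth_response(status_code: int) -> None:
    """Raise a descriptive error if the status code suggests an expired key.

    Assumption: an expired key surfaces as HTTP 401 or 403.
    """
    if status_code in (401, 403):
        steps = "\n".join(ROTATION_STEPS)
        raise ApiKeyExpiredError(
            f"Authentication failed (HTTP {status_code}). "
            "The super curator API Key has likely expired "
            "(keys expire roughly every 2 months).\n"
            f"To rotate it:\n{steps}"
        )
```

The job would call `check_auth_response` on the status code of its first authenticated request, so a failed run points straight at the rotation steps instead of failing with a bare authorization error.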