DataONEorg / mnlite

Light weight read-only DataONE member node in Python Flask
Apache License 2.0

D1 mn onboarding (mnlite init, plus scraping and testing of metadata) #17

Closed by iannesbitt 1 year ago

iannesbitt commented 1 year ago

Member node onboarding

scripts streamlining this process

step 1: data gathering

script outline:

  1. load/gather member node data (mn id, orcid, description, submitter, technical contact, sitemap urls, etc.)
  2. use member node identifier to initialize new node in opersist ("urn:node:mnTestBONARES" -> instance/nodes/mnTestBONARES/)
  3. look up subjects by orcid using CNIdentity.getSubjectInfo() and create if necessary (submitter, technical contact, node subject)
  4. (if necessary) dump new json to instance/nodes/mnTestBONARES/node.json
  5. system call to sudo systemctl restart mnlite
  6. use scrapy with sitemap urls to crawl database
  7. use pyshacl to test records against science-on-schema.org shape graph and raise/log error if non-compliant
  8. (production) add update schedule to crontab pending outcome of step 7
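
A rough sketch of how steps 2, 4, 5 and 7 above might fit together in Python. This is only an illustration under stated assumptions, not the actual mnlite code: the helper names (`node_dir_for`, `write_node_json`, `restart_mnlite`, `check_record`) and the contents of the node record are hypothetical, the opersist initialization itself is not shown, and the shape graph is assumed to be a Turtle file on disk. The only third-party call is `pyshacl.validate`, imported lazily so the rest runs without pyshacl installed.

```python
import json
import logging
import subprocess
from pathlib import Path


def node_dir_for(node_id: str, root: Path) -> Path:
    """Step 2: map 'urn:node:mnTestBONARES' to <root>/mnTestBONARES."""
    prefix = "urn:node:"
    if not node_id.startswith(prefix) or len(node_id) == len(prefix):
        raise ValueError(f"not a DataONE node identifier: {node_id!r}")
    return root / node_id[len(prefix):]


def write_node_json(node_dir: Path, record: dict) -> Path:
    """Step 4: dump the gathered node data to <node_dir>/node.json."""
    node_dir.mkdir(parents=True, exist_ok=True)
    path = node_dir / "node.json"
    path.write_text(json.dumps(record, indent=2))
    return path


def restart_mnlite(dry_run: bool = True) -> list:
    """Step 5: build the restart command; run it only if dry_run is False."""
    cmd = ["sudo", "systemctl", "restart", "mnlite"]
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


def check_record(jsonld_path: str, shape_path: str) -> bool:
    """Step 7: validate one JSON-LD record against a SHACL shape graph."""
    # Lazy import: pyshacl is only needed when validation is actually run.
    from pyshacl import validate

    conforms, _report_graph, report_text = validate(
        jsonld_path,
        shacl_graph=shape_path,
        data_graph_format="json-ld",
        shacl_graph_format="turtle",  # assumption: shape graph is Turtle
    )
    if not conforms:
        # Log the human-readable report so non-compliant records are visible.
        logging.error("%s is non-compliant:\n%s", jsonld_path, report_text)
    return conforms
```

For example, `node_dir_for("urn:node:mnTestBONARES", Path("instance/nodes"))` gives `instance/nodes/mnTestBONARES`, and `restart_mnlite()` with the default `dry_run=True` only returns the command list without calling `systemctl`.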

iannesbitt commented 1 year ago

step 2: node registration and approval

script outline:

  1. ...

mbjones commented 1 year ago

@iannesbitt I took a quick look at this, but didn't have time to dive in. Can I suggest that you present the approach at a Thursday dev meeting to get feedback from me, @datadavev, @taojing2002 and others on the onboarding process as encapsulated here?

iannesbitt commented 1 year ago

@mbjones yes that sounds like a good idea. I will put it on the schedule for next week.

iannesbitt commented 1 year ago

@mbjones actually I will be out next Thursday so it'll have to wait until March 2.