ExposuresProvider / cam-pipeline

Data loading pipeline for CAM database
https://exposuresprovider.github.io/cam-pipeline/
MIT License

Insert triple tagging models as GO-CAM or SIGNOR #76

Open balhoff opened 1 year ago

balhoff commented 1 year ago

This could be done with SPARQL updates in this step: https://github.com/ExposuresProvider/cam-pipeline/blob/95f98857d8b9215aba538bcbcbba6d75ad3f8350/Makefile#L68-L71

gaurav commented 1 year ago

The simple fix to this would be:

  1. Move line 71 of the Makefile up to line 69, so signor-models are loaded first.
  2. Use a SPARQL UPDATE to add some provenance information to all the SIGNOR models, i.e. something like: :model pav:importedFrom <https://signor.uniroma2.it/>
  3. Then import the remaining models, and annotate all remaining models with GO-CAM provenance.
  4. (Optional) Figure out some way to annotate CTD data similarly so we don't need to figure that out in cam-kp-api.
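
Steps 2 and 3 could be sketched as two SPARQL UPDATE operations along these lines. This is only a rough illustration, not the pipeline's actual update: it assumes each model is an owl:Ontology stored in its own named graph, and the GO-CAM source IRI <http://geneontology.org/> is a placeholder chosen for the example. In the pipeline, the second operation would be run as its own request, after the remaining models have been loaded.

```sparql
PREFIX pav: <http://purl.org/pav/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

# Step 2: run while only the SIGNOR models are loaded.
# Assumption: every model is an owl:Ontology in its own named graph.
INSERT {
  GRAPH ?g { ?model pav:importedFrom <https://signor.uniroma2.it/> }
}
WHERE {
  GRAPH ?g { ?model a owl:Ontology }
} ;

# Step 3: run separately, once the remaining models are loaded.
# Any model without a pav:importedFrom value yet is tagged as GO-CAM.
# <http://geneontology.org/> is a placeholder IRI, not a confirmed choice.
INSERT {
  GRAPH ?g { ?model pav:importedFrom <http://geneontology.org/> }
}
WHERE {
  GRAPH ?g { ?model a owl:Ontology }
  FILTER NOT EXISTS { GRAPH ?g { ?model pav:importedFrom ?source } }
}
```

The FILTER NOT EXISTS guard is what makes the load order in step 1 matter: the SIGNOR models must already carry their provenance triple before the GO-CAM annotation runs.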

gaurav commented 1 year ago

Incidentally, there appear to be 41 distinct pav:providedBy values in the dev triplestore at the moment. I'm not sure if any of these would be useful to expose.
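
For reference, a query roughly like the following should list those values along with how many subjects use each one. It is a sketch that makes no assumption about whether the triples sit in the default graph or in named graphs, so it checks both.

```sparql
PREFIX pav: <http://purl.org/pav/>

# List each distinct pav:providedBy value and the number of subjects using it.
SELECT ?provider (COUNT(DISTINCT ?s) AS ?subjects)
WHERE {
  { ?s pav:providedBy ?provider }
  UNION
  { GRAPH ?g { ?s pav:providedBy ?provider } }
}
GROUP BY ?provider
ORDER BY DESC(?subjects)
```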