EarthCubeGeochron / Sparrow

A software tool and schema+API spec for connecting laboratory measurements to data consumers
https://sparrow-data.org
Mozilla Public License 2.0

How to track the source of data in Sparrow databases #111

Open yeshancqcq opened 3 years ago

yeshancqcq commented 3 years ago

The current proposal is to add two new fields to the database to track:

  1. the compilation (e.g. EarthChem, the 10Be dataset by Heyman, etc.)
  2. the original publication

These fields can either be directly added to the sample table or can be added as a new datum (similar to the analysis) so they can be rendered along with other data on the individual sample page.

davenquinn commented 3 years ago

For 1: This is the key problem. We'll have to track not only the compilation, but also (in some cases such as IGSN or Geochron) a specific link to the page from which the data was sourced. It's worth keeping in mind also that we will have some cases (hopefully an increasing number as Sparrow becomes more useful!) where data will be sourced from one location but linkable to multiple external compilations.

We won't just need to link samples: in some cases sessions (or maybe even projects) are the right level to link to external resources. For example, in the LaserChron lab we have some samples that are measured repeatedly as calibration targets in different projects.

Maybe we need some sort of external_link table that stores source name and link and can be linked to multiple models within Sparrow...
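As a rough illustration of the `external_link` idea, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the table layout, the column names (`source_name`, `url`, `model`, `model_id`), the helper functions, and the placeholder URLs are hypothetical and do not reflect Sparrow's actual schema. It uses a generic (model, id) reference so that one table can point at samples, sessions, or projects.

```python
import sqlite3

# Hypothetical schema for illustration only; not Sparrow's actual tables.
SCHEMA = """
CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE session (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER REFERENCES sample(id)
);
CREATE TABLE external_link (
    id INTEGER PRIMARY KEY,
    source_name TEXT NOT NULL,  -- e.g. a compilation name
    url TEXT,                   -- optional link to the specific source page
    model TEXT NOT NULL CHECK (model IN ('sample', 'session', 'project')),
    model_id INTEGER NOT NULL   -- id of the row in the linked model's table
);
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_link(conn, model, model_id, source_name, url=None):
    """Attach an external source to a row of the given model."""
    conn.execute(
        "INSERT INTO external_link (source_name, url, model, model_id) "
        "VALUES (?, ?, ?, ?)",
        (source_name, url, model, model_id),
    )


def links_for(conn, model, model_id):
    """Return (source_name, url) pairs for one row, in insertion order."""
    rows = conn.execute(
        "SELECT source_name, url FROM external_link "
        "WHERE model = ? AND model_id = ? ORDER BY id",
        (model, model_id),
    )
    return rows.fetchall()


conn = make_db()
conn.execute("INSERT INTO sample (id, name) VALUES (1, 'example-sample')")
conn.execute("INSERT INTO session (id, sample_id) VALUES (1, 1)")

# One sample linked to two external sources; URLs are placeholders.
add_link(conn, "sample", 1, "EarthChem")
add_link(conn, "sample", 1, "IGSN", "https://example.org/igsn/PLACEHOLDER")
# A session linked at its own level rather than through its sample.
add_link(conn, "session", 1, "Geochron", "https://example.org/geochron/PLACEHOLDER")

for source_name, url in links_for(conn, "sample", 1):
    print(source_name, url)
```

One trade-off of this layout is that the database cannot enforce `model_id` as a foreign key, since it may point into different tables. Separate join tables per model (for example, one for samples and one for sessions) would restore that guarantee at the cost of more tables.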

For 2: We have a model to track publications, which can be linked to projects, samples, and sessions. So that should not need a separate field; we should just make sure publication data goes into the database at the correct level for links to be made.