tdwg / gbwg

Genomic Biodiversity Interest Group
Apache License 2.0

DwC Mapping - MIXS:0000026 source_mat_id #18

Closed tucotuco closed 3 years ago

tucotuco commented 3 years ago
| Field | Value |
| --- | --- |
| subject_id | http://rs.tdwg.org/dwc/terms/materialSampleID |
| subject_definition | An identifier for the MaterialSample (as opposed to a particular digital record of the material sample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique. |
| subject_usage_notes | Recommended best practice is to use a persistent, globally unique identifier. |
| subject_examples | 06809dc5-f143-459a-be1a-6f03e63fc083 |
| predicate_id | skos:exactMatch |
| object_id | MIXS:0000026 |
| object_label | source_mat_id |
| object definition | A unique identifier assigned to a material sample (as defined by http://rs.tdwg.org/dwc/terms/materialSampleID, and as opposed to a particular digital record of a material sample) used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher, /bio_material, or /culture_collection may or may not share the same value as the source_mat_id field. For instance, the /specimen_voucher qualifier and source_mat_id may both contain 'UAM:Herps:14', referring to both the specimen voucher and sampled tissue with the same identifier. However, the /culture_collection qualifier may refer to a value from an initial culture (e.g. ATCC:11775) while source_mat_id would refer to an identifier from some derived culture from which the nucleic acids were extracted (e.g. xatc123 or ark:/2154/R2). |
| object source | https://github.com/GenomicsStandardsConsortium/mixs-legacy/blob/master/mixs5/mixs_v5.xlsx |
| comment | |
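
For anyone who wants this mapping in machine-readable form, here is a minimal Python sketch that serializes the core fields as a tab-separated row. The values are copied from the table above; the `MAPPING` name, the `mapping_to_tsv` helper, and the column order are assumptions made for illustration, not part of the group's actual spreadsheet tooling.

```python
import csv
import io

# Values copied from the mapping table; the column order is an assumption.
MAPPING = {
    "subject_id": "http://rs.tdwg.org/dwc/terms/materialSampleID",
    "predicate_id": "skos:exactMatch",
    "object_id": "MIXS:0000026",
    "object_label": "source_mat_id",
}


def mapping_to_tsv(mapping):
    """Return a header line plus one data line, tab-separated (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(mapping),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerow(mapping)
    return buffer.getvalue()


if __name__ == "__main__":
    print(mapping_to_tsv(MAPPING), end="")
```

Further fields from the table (definitions, usage notes, comment) could be added to the dictionary in the same way.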
raissameyer commented 3 years ago

Hi @tucotuco,

Many thanks for your work on this!

I just edited the table in the above comment so that the object ID and label match what you gave in our SSSOM mapping spreadsheet and in the issue title (the table previously said lat_lon). Hope it's now as you intended it!

tucotuco commented 3 years ago

Yes, that's how I intended it. I must have missed replacing lat_lon with source_mat_id after pasting the other table as a template. Thanks.

raissameyer commented 3 years ago

Suggested syntax predicate for the mapping above: https://github.com/tdwg/gbwg/issues/18#issue-805164468

| Field | Value |
| --- | --- |
| subject_id | http://rs.tdwg.org/dwc/terms/materialSampleID |
| subject_value_syntax - expected_value - unit | |
| syntax_predicate_id | skos:exactMatch |
| object_id | MIXS:0000026 |
| object_value_syntax - expected_value - unit | {text} - for cultures of microorganisms: identifiers for two culture collections; for other material a unique arbitrary identifier |
| syntax_comment | Both expect a mix of letters, numbers, etc. |

Note that DwC has this as part of its definition:

In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique.
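
One way to read that fallback rule, as a rough Python sketch: keep an existing materialSampleID when the record has one, otherwise build one from other identifiers in the record. The choice of `institutionCode`, `collectionCode`, and `catalogNumber` as the components, the `:` separator (modelled on the 'UAM:Herps:14' example in the MIxS definition), and the function name are all assumptions for illustration; DwC does not prescribe which identifiers to combine.

```python
# Assumed components for the fallback; DwC leaves the choice open.
FALLBACK_FIELDS = ("institutionCode", "collectionCode", "catalogNumber")


def material_sample_id(record):
    """Return the record's materialSampleID, or construct one (hypothetical helper)."""
    existing = str(record.get("materialSampleID") or "").strip()
    if existing:
        # A supplied identifier always wins over a constructed one.
        return existing

    parts = [str(record.get(field) or "").strip() for field in FALLBACK_FIELDS]
    if not all(parts):
        raise ValueError("record lacks the identifiers needed to construct an ID")

    # Assumed separator, following the 'UAM:Herps:14' pattern.
    return ":".join(parts)


if __name__ == "__main__":
    record = {
        "institutionCode": "UAM",
        "collectionCode": "Herps",
        "catalogNumber": "14",
    }
    print(material_sample_id(record))
```

A constructed value like this is only as unique as its components, so it would still map to source_mat_id as free text rather than as a guaranteed persistent identifier.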