tdwg / gbwg

Genomic Biodiversity Interest Group
Apache License 2.0

MIxS IRIs to link DwC and MIxS #11

Open raissameyer opened 3 years ago

raissameyer commented 3 years ago

Hi @wdduncan

The MIxS IRIs are going to be essential for this Task Group. Great that you are developing them!

Would you know if there is a Beta release just for us to experiment with? If so, would you mind linking to that resource here?

Best, Raïssa

wdduncan commented 3 years ago

@raissameyer Work is still in progress on finalizing IDs. Currently, IRIs are being stored in a spreadsheet: https://docs.google.com/spreadsheets/d/1QDeeUcDqXes69Y2RjU2aWgOpCVWo5OVsBX9MKmMqi_o/edit#gid=567040283

You may have to request access though.

You can look in columns AC and AD on the "MIxS6 Core - to edit" tab, and columns L and M on the "MIxS6 packages - to edit" tab. This is still very much a work in progress.

raissameyer commented 3 years ago

@wdduncan Many thanks for linking to the resource!

timrobertson100 commented 3 years ago

> You can look in columns AC, AD on the MIxS6 Core - to edit tab

So using the example of MIXS:0000004 - is this the full IRI (i.e. a scheme of MIXS) or a placeholder, please? Are there guidelines for how to use these with http?

Thanks

thomasstjerne commented 3 years ago

For MIxS terms whose expected value is an enumeration, the value syntax is given as a controlled vocabulary, for example [metabat|maxbin|concoct|groupm|esom|metawatt|combination|other] for the structured comment name bin_software (MIXS:0000078).

In Darwin Core, these enumerations are extracted into (sub)vocabularies (used for picklists in the IPT software, etc.). Each concept in an enumeration must have a URI, as seen here with mocked example URIs.
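As a rough illustration of the enumeration syntax described above, the following sketch splits a bracketed, pipe-separated value syntax into its individual concepts. The function name `parse_enumeration` is hypothetical and not part of any MIxS or Darwin Core tooling; it only assumes the `[a|b|c]` form quoted in this comment.

```python
def parse_enumeration(value_syntax: str) -> list[str]:
    """Split an enumeration written as '[a|b|c]' into its individual values."""
    text = value_syntax.strip()
    # Assumption: an enumeration is always wrapped in square brackets.
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not an enumeration: {value_syntax!r}")
    # Drop the brackets, split on '|', and ignore empty entries.
    return [item.strip() for item in text[1:-1].split("|") if item.strip()]


# The bin_software (MIXS:0000078) example from the comment above.
bin_software = parse_enumeration(
    "[metabat|maxbin|concoct|groupm|esom|metawatt|combination|other]"
)
print(bin_software)
```

Each value in the resulting list is a candidate concept for a (sub)vocabulary; how those concepts get URIs is exactly the open question in this thread.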

My question is, will you issue IRIs for each concept in the enumerations, and how should we form the URIs for the vocabulary (enumeration) concepts?

Should I open a separate issue for this?

wdduncan commented 3 years ago

@thomasstjerne If I understand your question (and I might not ... sorry), I think the URI field would (once MIxS 6 is released) reference the MIxS CURIE for the term.

Does that make sense?

only1chunts commented 3 years ago

I concur with @wdduncan. At this time, GSC is not planning to create URIs for each term in suggested vocabularies. We hope to be able to use existing ontology terms in most instances, hence the use of CURIEs, as Bill says. We do intend to seek homes for our suggested values in existing ontologies and to assist whichever body takes responsibility for said ontology. For the "bin_software" example you cite, it might actually be prudent for us to move away from any sort of controlled vocabulary, as that list will be continuously changing! Instead, I would suggest we use something like an RRID (or any appropriate PID) to identify the software used. I have created an issue in the GSC mixs GitHub repo to get this discussed.

thomasstjerne commented 3 years ago

Thanks @wdduncan and @only1chunts. Can you also answer the question from @timrobertson100:

> So using the example of MIXS:0000004 - is this the full IRI (i.e. a scheme of MIXS) or a placeholder, please? Are there guidelines for how to use these with http?

wdduncan commented 3 years ago

MIXS:0000004 is a CURIE.

The MIXS part is a prefix that expands into a defined namespace. The namespace, as far as I know, has not been registered yet.

For example, in obo:ENVO_01001357, the obo part expands to http://purl.obolibrary.org/obo/, and the full IRI is http://purl.obolibrary.org/obo/ENVO_01001357, which does resolve in a web browser.
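The expansion described above can be sketched in a few lines. This is a minimal illustration, not an official resolver: the `PREFIXES` table and the `expand_curie` helper are hypothetical names, and the only mapping included is the `obo` one given in this comment.

```python
# Prefix table: only the obo mapping stated in this thread is included.
PREFIXES = {"obo": "http://purl.obolibrary.org/obo/"}


def expand_curie(curie: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand a CURIE such as 'obo:ENVO_01001357' into a full IRI."""
    prefix, sep, local_id = curie.partition(":")
    # A CURIE needs a colon and a prefix we know how to expand.
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    # Expansion is plain concatenation of namespace and local identifier.
    return prefixes[prefix] + local_id


print(expand_curie("obo:ENVO_01001357"))
# prints http://purl.obolibrary.org/obo/ENVO_01001357
```

A CURIE with an unregistered prefix, such as `MIXS:0000004` at this point in the thread, raises `KeyError` here because no namespace for it is in the table.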

thomasstjerne commented 3 years ago

@wdduncan Any news on the namespace registration?

wdduncan commented 3 years ago

@thomasstjerne Not yet. They are still finalizing mixs 6.

timrobertson100 commented 3 years ago

> They are still finalizing mixs 6.

Thanks, @wdduncan. Is there a timeline for when this will be done, please? If it helps, for the extension definition the URIs don't need to resolve - but we need to know what the URIs will be.

wdduncan commented 3 years ago

I think it is safe to use the IRIs currently in the spreadsheet. I know they want to get the mixs 6 release out soon.

timrobertson100 commented 3 years ago

> I think it is safe to use the IRIs currently in the spreadsheet. I know they want to get the mixs 6 release out soon.

Thanks. What does MIXS: expand to, please?

wdduncan commented 3 years ago

MIXS: is supposed to expand to: https://w3id.org/gensc.org/terms/MIXS_

You can follow the conversation here: https://github.com/GenomicsStandardsConsortium/mixs-rdf/issues/2

The request for the w3id space is supposed to happen in the next few weeks.

thomasstjerne commented 3 years ago

@wdduncan Reading the last comment in that conversation, it seems that it would be https://w3id.org/gensc/terms/MIXS_ (using gensc rather than gensc.org)?

wdduncan commented 3 years ago

Yes. There was a typo in my previous post. Sorry :(
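For anyone building the extension definition in the meantime, here is a small sketch of converting between MIXS CURIEs and full IRIs. It assumes the corrected namespace https://w3id.org/gensc/terms/MIXS_ from this thread, which had not yet been registered at the time of writing and may still change. The helper names `mixs_iri` and `mixs_curie` are hypothetical.

```python
# Assumption: the namespace agreed in this thread; not yet registered with w3id.
MIXS_NAMESPACE = "https://w3id.org/gensc/terms/MIXS_"


def mixs_iri(curie: str) -> str:
    """Expand a CURIE such as 'MIXS:0000004' into a full IRI."""
    prefix, sep, local_id = curie.partition(":")
    # Assumption: local identifiers are purely numeric, as in MIXS:0000004.
    if prefix != "MIXS" or not sep or not local_id.isdigit():
        raise ValueError(f"not a MIXS CURIE: {curie!r}")
    return MIXS_NAMESPACE + local_id


def mixs_curie(iri: str) -> str:
    """Compact a full MIXS IRI back into its CURIE form."""
    if not iri.startswith(MIXS_NAMESPACE):
        raise ValueError(f"not in the MIXS namespace: {iri!r}")
    return "MIXS:" + iri[len(MIXS_NAMESPACE):]


print(mixs_iri("MIXS:0000004"))
# prints https://w3id.org/gensc/terms/MIXS_0000004
print(mixs_curie("https://w3id.org/gensc/terms/MIXS_0000078"))
# prints MIXS:0000078
```

Keeping the namespace in a single constant means that if the registered w3id path ends up differing, only one line needs to change.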