Open gaurav opened 3 years ago
I've been looking at the .ttl output, and there's another IRI in there, http://UNKNOWN.org/, used in the enumerations. (Maybe it needs to be the same as ccdh's?)
In the DMH call just now, Matt and Brian mentioned that these should use the standards laid out in the DST publication on identifiers.
- crdch:Entity.attribute. Entity should be capitalized (which doesn't appear to be the case right now).
- https://w3id.org/crdc/v1.1/Entity.attribute?
- crdc:dm0000431
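To make the candidate forms concrete, here is a minimal Python sketch of how the readable and opaque identifiers could be generated. The helper names are hypothetical, the crdch prefix is not registered anywhere yet, and the seven-digit zero padding is only inferred from the crdc:dm0000431 example.

```python
def readable_curie(entity: str, attribute: str, prefix: str = "crdch") -> str:
    """Build a human-readable CURIE of the form prefix:Entity.attribute."""
    # Capitalize only the first character so CamelCase names stay intact.
    entity_name = entity[:1].upper() + entity[1:]
    return f"{prefix}:{entity_name}.{attribute}"


def opaque_curie(number: int, type_code: str = "dm", prefix: str = "crdc") -> str:
    """Build an opaque CURIE of the form prefix:<type code><number>."""
    # Assumption: seven zero-padded digits, as in crdc:dm0000431.
    return f"{prefix}:{type_code}{number:07d}"


print(readable_curie("bodySite", "site"))
print(opaque_curie(431))
```

The readable form carries its meaning in the identifier itself, while the opaque form needs a lookup table but survives renames of the entity or attribute.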
@cmungall @jmcmurry @majensen Thoughts? We could schedule some time to discuss this on one of our calls, but it'd be great to have initial thoughts in this GitHub issue.
I don't quite understand the options; let's discuss over Slack.
We currently use a number of dummy prefixes in the CCDH model:
We should replace these with actual IRIs.
For the CCDH IRIs, we should probably register a ccdh, crdc-h, or crdch prefix at w3id.org and use that. As per the Identifier Recommendations, the CRDC prefix will be at https://w3id.org/crdc/ and this will be used as e.g. crdc:su0000001 (for a subject), crdc:st000002 (for a study), and so on. So it might make sense to reserve a two-letter code for the model (dm?) and make properties based on that, but I think we'd prefer e.g. ccdh:BodySite__site rather than crdc:dm0000431.
For the node IRIs, this is primarily a convenience tool so we can use LinkML mapping fields, which use CURIE mappings. We could ask LinkML for non-CURIE mapping fields and use those instead, or we could try to find actual IRIs that make sense (e.g. for Sample.sample_type, we can construct the pretty odd IRI https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?view=table-definition-view&id=sample&anchor=sample_type to look up its documentation). We could also ask GDC to mint identifiers for their properties.
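As a rough illustration of what real prefixes would buy us, the Python sketch below expands CURIEs through a prefix map and builds the GDC documentation IRI mentioned above. The function names are hypothetical, and the https://w3id.org/ccdh/ namespace is only a placeholder for whichever prefix ends up registered; only the crdc namespace comes from the Identifier Recommendations.

```python
from urllib.parse import urlencode

PREFIXES = {
    "crdc": "https://w3id.org/crdc/",  # from the Identifier Recommendations
    "ccdh": "https://w3id.org/ccdh/",  # placeholder: not registered yet
}


def expand_curie(curie: str, prefixes: dict = PREFIXES) -> str:
    """Expand prefix:local into a full IRI using the prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no IRI registered for prefix in {curie!r}")
    return prefixes[prefix] + local


def gdc_doc_url(node: str, prop: str) -> str:
    """Build the GDC data dictionary viewer IRI for a node property."""
    query = urlencode(
        {"view": "table-definition-view", "id": node, "anchor": prop}
    )
    return "https://docs.gdc.cancer.gov/Data_Dictionary/viewer/#?" + query


print(expand_curie("crdc:su0000001"))
print(expand_curie("ccdh:BodySite__site"))
print(gdc_doc_url("sample", "sample_type"))
```

A dummy prefix such as GDC has no entry in the map, so expand_curie raises KeyError for it, which is the gap this issue is about.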
We actually have another odd possibility for node properties: many of them are present in the caDSR and the NCI Thesaurus, so instead of GDC:sample.sample_type we could say caDSR:3111302v2.0 or NCIT:C70713.
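If we record all three identifiers for a property, choosing which one to emit becomes a small policy decision. The following is a minimal Python sketch under that assumption; pick_mapping and the preference order are hypothetical and not part of LinkML or the CCDH model.

```python
# Candidate identifiers for the same property, taken from the example above.
SAMPLE_TYPE_MAPPINGS = [
    "GDC:sample.sample_type",
    "caDSR:3111302v2.0",
    "NCIT:C70713",
]


def pick_mapping(curies, preferred_prefixes):
    """Return the first CURIE whose prefix ranks highest, or None."""
    for prefix in preferred_prefixes:
        for curie in curies:
            if curie.split(":", 1)[0] == prefix:
                return curie
    return None


# Prefer identifiers minted by a terminology service over the dummy prefix.
print(pick_mapping(SAMPLE_TYPE_MAPPINGS, ["NCIT", "caDSR", "GDC"]))
```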