biolink / biolink-model

Schema and generated objects for biolink data model and upper ontology
https://biolink.github.io/biolink-model/

Confusion on PMC vs PMCID #1366

Open colleenXu opened 1 year ago

colleenXu commented 1 year ago

I notice a difference between PMC and PMCID, and I'm wondering if this is intentional. I'm also not certain which one to use.

Based on the prefix-map, it looks like:

All of the resources I'm working with provide "PMC"-style IDs that start with "PMC", so it looks like I should use the prefix PMCID. Is that correct?


Part of my confusion comes from this documentation which shows both PMC and PMCID IDs that don't start with "PMC"...


Side note: do the prefix-maps need changing if URLs are being redirected? I'm noticing that

colleenXu commented 1 year ago

Perhaps @sierra-moxon would be the person to look into this?

colleenXu commented 11 months ago

(as discussed on Monday)

In bioregistry, the two prefixes (PMC, PMCID) are for the same namespace, which has local unique identifiers that start with "PMC" (regex: ^PMC\d+$, ex: PMC3084216).
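As a minimal sketch of what that pattern accepts, the check below applies the regex quoted above to a local unique identifier. The function name `is_valid_pmc_local_id` is hypothetical and is not part of bioregistry or biolink-model.

```python
import re

# Pattern quoted above from bioregistry for PMC local unique identifiers.
PMC_LOCAL_ID = re.compile(r"PMC\d+")


def is_valid_pmc_local_id(local_id: str) -> bool:
    """Return True if the whole string matches ^PMC\\d+$ (hypothetical helper)."""
    # fullmatch anchors at both ends, so a trailing newline is rejected too.
    return PMC_LOCAL_ID.fullmatch(local_id) is not None


print(is_valid_pmc_local_id("PMC3084216"))  # identifier with the "PMC" part
print(is_valid_pmc_local_id("3084216"))     # bare number, no "PMC" part
```

Under this pattern the first call passes and the second fails, because a bare number lacks the leading "PMC".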

By contrast, in biolink-model the two prefixes seem to have different patterns for local unique identifiers:

mbrush commented 11 months ago

I think this may be as simple as fixing the Biolink Model prefix registry to add "PMC" to the end of the namespace used for expanding the PMCID prefix (i.e. "PMCID": "http://www.ncbi.nlm.nih.gov/pmc/PMC").

If we do this, then the examples in the spec doc resolve, and we would be consistent in not requiring anything but the numeric identifier for a publication to follow the prefix (whether it is PMID, PMC, or PMCID).
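The effect of that namespace change on CURIE expansion can be sketched roughly as follows. The `AFTER` namespace is the one quoted in this comment. The `BEFORE` namespace is an assumption inferred from the phrase 'add "PMC" to the end', and `expand` is a hypothetical helper, not a biolink-model API.

```python
from typing import Mapping

# Assumption: the namespace before the change, inferred from the proposal.
BEFORE = {"PMCID": "http://www.ncbi.nlm.nih.gov/pmc/"}
# Namespace proposed in the comment above.
AFTER = {"PMCID": "http://www.ncbi.nlm.nih.gov/pmc/PMC"}


def expand(curie: str, prefix_map: Mapping[str, str]) -> str:
    """Expand a CURIE by concatenating the namespace and the local identifier."""
    prefix, local_id = curie.split(":", 1)
    return prefix_map[prefix] + local_id


# A numeric-only local identifier gains the "PMC" part from the namespace.
print(expand("PMCID:3084216", AFTER))
# A local identifier that already starts with "PMC" ends up with it twice.
print(expand("PMCID:PMC3084216", AFTER))
# With the assumed old namespace, the "PMC"-style identifier expands cleanly.
print(expand("PMCID:PMC3084216", BEFORE))
```

The second call shows the trade-off: with "PMC" baked into the namespace, identifiers that already carry "PMC" expand to a doubled `PMCPMC...` path, so data sources would have to strip it first.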

I created PR #1402 to make this simple change.

@sierra-moxon @colleenXu will this do the trick?

colleenXu commented 10 months ago

@mbrush @sierra-moxon

I'm not sure about doing this. Sierra told me that I should use bioregistry to find the "patterns for local unique identifiers", so I was under the impression that if we were going to switch to one pattern, we'd pick bioregistry's method:

so PMC:PMC1234...

cthoyt commented 10 months ago

In https://github.com/biopragmatics/bioregistry/issues/965, we got authoritative confirmation from the PMC team that PMC local unique identifiers should contain the "PMC" part. Therefore, CURIEs should look like: pmc:PMC1234
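For resources that mix both styles, a normalization step along these lines could rewrite incoming CURIEs into that form. This is only a sketch: `normalize_pmc_curie` is a hypothetical helper, and the default output prefix `pmc` simply mirrors the example in this comment rather than any settled Biolink convention.

```python
import re


def normalize_pmc_curie(curie: str, prefix: str = "pmc") -> str:
    """Rewrite a PMC or PMCID CURIE so the local identifier carries 'PMC'."""
    in_prefix, sep, local_id = curie.partition(":")
    # Only the two prefixes discussed in this thread are accepted.
    if not sep or in_prefix.lower() not in {"pmc", "pmcid"}:
        raise ValueError(f"not a PMC/PMCID CURIE: {curie!r}")
    # Drop an existing "PMC" part so it is never doubled.
    digits = local_id[3:] if local_id.upper().startswith("PMC") else local_id
    if not re.fullmatch(r"[0-9]+", digits):
        raise ValueError(f"unexpected local identifier: {local_id!r}")
    return f"{prefix}:PMC{digits}"


print(normalize_pmc_curie("PMCID:1234"))
print(normalize_pmc_curie("PMC:PMC1234"))
```

Both calls yield the same CURIE, so downstream code sees one form regardless of how the source wrote it.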

gglusman commented 1 month ago

Related to the above, but not the same (or at least, I don't see this specific issue discussed): Is the domain for PMC entities PMC or PMCID? Biolink uses PMC, but all NCBI pages I've seen display PMCID: PMCnnnnnn (as opposed to PMC: PMCnnnnnn). Note I'm not referring to whether to include PMC after the colon, but what to use before it.