biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

Update Korean Clinical Research Information Service #985

Closed lnanderson closed 4 months ago

lnanderson commented 9 months ago

Prefix

kcris

Explanation

The Clinical Research Information Service (CRiS), Republic of Korea prefix needs to be corrected. The prefix kcris is unresolvable (see error).

Suggested Update

Update Example

Contributor ORCID

0000-0002-8741-7823

bgyori commented 9 months ago

Hi @lnanderson, it appears to me that the choice of prefix itself isn't an issue here. My guess is that the actual reason identifiers like KCT0008394 cannot be resolved directly in this resource is that the URL https://cris.nih.go.kr/cris/search/detailSearch.do?seq=23973&search_page=L&search_lang=E&lang=E doesn't contain the ID anywhere. We could double check if there is a URL pattern that cris.nih.go.kr provides that uses an ID like KCT0008394 to get to the same landing page which we could then use as a URL pattern.

Based on your examples above, the WHO is actually able to resolve these IDs as an external provider: https://trialsearch.who.int/Trial2.aspx?TrialID=KCT0008394 so that could still be catalogued on this Bioregistry entry.
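To make the URL-pattern point concrete, here is a minimal Python sketch. It assumes the WHO URL quoted above generalizes to other trial IDs, with the ID as the only variable part; `WHO_FORMAT` and `who_url` are hypothetical names for illustration and are not part of the Bioregistry. Note that the same approach cannot work for the cris.nih.go.kr URL, since its `seq=23973` value cannot be derived from `KCT0008394`.

```python
from urllib.parse import quote

# Assumption: the WHO URL quoted in this thread works as a format string
# where the trial ID is the only variable part.
WHO_FORMAT = "https://trialsearch.who.int/Trial2.aspx?TrialID={}"


def who_url(trial_id: str) -> str:
    """Build a WHO trial search URL from a registry-assigned trial ID."""
    # Strip stray whitespace and percent-encode anything unsafe for a query value.
    return WHO_FORMAT.format(quote(trial_id.strip(), safe=""))


if __name__ == "__main__":
    print(who_url("KCT0008394"))
```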

lnanderson commented 9 months ago

@bgyori thanks for clarifying. I suggested cris as a prefix correction to remain consistent with the established acronym provided by the data provider (at your discretion).

I did notice they are using an entirely different URL pattern (seq=#####), in this case seq=23973. I tried a few other registration numbers, and the same seq pattern resolves for each, but with a different number every time.

On a side note, I wonder if it is worth adding ICTRP as a new prefix instead and adding it to the collection, as it commonly resolves with "TrialID=" regardless of the primary registry number. For example, TrialID:NCT05724472 and TrialID:KCT0008394. Thoughts?

cthoyt commented 9 months ago

I second what Ben says. It's actually noted in the description on https://bioregistry.io/registry/kcris that local unique identifiers within kcris can't be resolved.

The URL https://trialsearch.who.int/Trial2.aspx?TrialID=KCT0008394 is interesting. This works because the community of clinical trial registry developers and maintainers doesn't actually use CURIEs, but each embeds a pseudo-prefix in the local unique identifiers. This makes it possible to have a service like the WHO clinical trial search that can resolve a wide variety of different semantic spaces without using CURIEs.

I disagree with minting a prefix that groups all of these. They are different semantic spaces, representing different data from different sources with different identifier patterns. However, I think there might be a way to hack the Bioregistry so that they all resolve using this format string.
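As a rough illustration of that idea, the sketch below keeps each registry as its own prefix while sharing one WHO format string as an external provider. This is not how the Bioregistry implements providers: `PSEUDO_PREFIXES`, `guess_prefix`, and `resolve_via_who` are hypothetical names, and the mapping from pseudo-prefix to Bioregistry prefix is an illustrative assumption covering only the two identifiers mentioned in this thread.

```python
from typing import Optional, Tuple

# Assumption: illustrative mapping from the pseudo-prefix embedded in a local
# identifier to a Bioregistry prefix. Only the two cases from this thread.
PSEUDO_PREFIXES = {
    "NCT": "clinicaltrials",
    "KCT": "kcris",
}

# Assumption: the WHO URL quoted in this thread, used as a shared format string.
WHO_FORMAT = "https://trialsearch.who.int/Trial2.aspx?TrialID={}"


def guess_prefix(local_id: str) -> Optional[str]:
    """Return the prefix whose pseudo-prefix starts the identifier, if any."""
    upper = local_id.strip().upper()
    for pseudo, prefix in PSEUDO_PREFIXES.items():
        if upper.startswith(pseudo):
            return prefix
    return None


def resolve_via_who(local_id: str) -> Optional[Tuple[str, str]]:
    """Return (prefix, WHO URL) for a known identifier, else None."""
    prefix = guess_prefix(local_id)
    if prefix is None:
        return None
    # The semantic spaces stay separate; only the provider URL is shared.
    return prefix, WHO_FORMAT.format(local_id.strip())


if __name__ == "__main__":
    for identifier in ("KCT0008394", "NCT05724472", "XYZ123"):
        print(identifier, resolve_via_who(identifier))
```

The design point is that the shared format string is attached per prefix rather than replacing the prefixes with a single grouped one.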

wrt KCRIS vs CRIS, I dug into a lot of resources, and I also saw KCRIS used. CRIS might have had a collision with another resource, too

lnanderson commented 9 months ago

@cthoyt I see the note now, thanks. It would be interesting if you could link in the string mentioned somehow. I have been noticing either a language barrier that makes it challenging to interpret some primary registry sources OR in some cases a primary repository is no longer being sustained and will have to fully migrate all content to the ICTRP portal only moving forward (see example here for both TCTR and NTR).

Feel free to close this issue as you see fit. Thank you both!