OBOAcademy / obook

OBO Organized Knowledge: Training materials for becoming an OBO engineer
https://oboacademy.github.io/obook/
Creative Commons Zero v1.0 Universal

Discussion: Do we really need an external PURL system for our PIDs? #462

Open matentzn opened 7 months ago

matentzn commented 7 months ago

There are some heavy battles around this in Monarch and beyond. There are two clashing sentiments:

Camp A: "A well-funded resource should mint their own entity PURLs", with the rationale that (1) no true PURL system exists (the very concept makes no sense on the web, since nothing can be truly persistent if it has to be managed, maintained and paid for) and (2) it's nice to have total control over the domain (i.e. what happens if a commercial entity "buys" the domain and you have 250K worth of tax dollars invested in standardising your entity IRIs to that resource?). The idea is that you basically manage your own PURL system, as an .htaccess file or some such, document it somewhere and you are good to go (e.g. https://rarediseases.info.nih.gov/entity/disease/6816 -redirects to-> https://rarediseases.info.nih.gov/diseases/6816/Index).
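
To make the Camp A idea concrete, here is a minimal sketch of a self-managed redirect table in Python. It only mirrors the single example redirect above; the names `REDIRECT_RULES`, `BASE` and `resolve` are hypothetical and chosen for illustration, and a real deployment would more likely live in the web server configuration (an .htaccess file or similar) than in application code.

```python
import re
from typing import Optional

# Assumption: the stable "entity" paths and their current targets share one domain.
BASE = "https://rarediseases.info.nih.gov"

# Hypothetical rule table: (pattern for the persistent path, template for the current path).
REDIRECT_RULES = [
    (re.compile(r"^/entity/disease/(\d+)$"), "/diseases/{0}/Index"),
]


def resolve(purl: str) -> Optional[str]:
    """Return the current URL for a persistent URL, or None if no rule matches."""
    if not purl.startswith(BASE):
        return None
    path = purl[len(BASE):]
    for pattern, template in REDIRECT_RULES:
        match = pattern.match(path)
        if match:
            # Fill the captured identifier into the target template.
            return BASE + template.format(*match.groups())
    return None


print(resolve("https://rarediseases.info.nih.gov/entity/disease/6816"))
```

The point of the sketch is that the "PURL system" is nothing more than a documented mapping that the resource owner keeps up to date: if the site layout changes, only the right-hand side of each rule changes, and the entity URL stays the same.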

Camp B: "PURLs are not URLs, and we should use a persistent identifier system specifically designed to mint GUPRIs." For SSSOM, LinkML and many others, we have resorted to https://github.com/perma-id/w3id.org (privately maintained, NOT related to W3C; example: https://w3id.org/sssom). Previously we used https://purl.org/. The main argument on Camp B's side is that the entities should "live longer than the resources that contained them". So if SSSOM (website, registry, etc.) dies (example above), the identifiers still "work" as the global standard to refer to SSSOM elements, for example, and can be redirected to an archive like Zenodo if need be.

I personally was for most of my life firmly in Camp B, but very smart people that I respect a lot have pulled me closer to Camp A.

Here is my rule of thumb: Do you expect your resource (URL/domain) to exist on a

  1. 10+ year timeframe? -> Camp A.
  2. 0-5 year timeframe? -> Camp B.
  3. 5-10 year timeframe? -> Judgement call, with a mild preference for Camp B.
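
The rule of thumb can also be written down as a tiny Python function. This is only a sketch: the function name `recommend_camp` is hypothetical, and how to treat the exact boundaries (5 and 10 years) is my assumption, since the ranges above overlap at those points.

```python
def recommend_camp(expected_years: float) -> str:
    """Map the expected lifetime of a resource's URL/domain to a camp."""
    if expected_years < 0:
        raise ValueError("expected_years must be non-negative")
    if expected_years >= 10:
        # Assumption: exactly 10 years counts as the "10+ year" case.
        return "Camp A"
    if expected_years <= 5:
        # Assumption: exactly 5 years counts as the "0-5 year" case.
        return "Camp B"
    return "Judgement call (mild preference for Camp B)"


for years in (3, 7, 15):
    print(years, "->", recommend_camp(years))
```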