LTER-LIFE / FDFDT

FAIR Data for Digital Twins

Opacity of eventIDs #15

Open CherineJ opened 5 months ago

CherineJ commented 5 months ago

Creating reasonable eventIDs remains a challenge in every data set. So far, we have built all eventIDs (and other IDs accordingly) by combining the fields in the data that together uniquely describe the event (e.g., year-block-treatment-plot). As a result, the ID itself conveys information about the event and is relatively human-readable.
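
To make the current approach concrete, here is a minimal sketch of how such a composite eventID could be assembled. The function name `make_event_id`, the separator, and the example values are hypothetical and only illustrate the year-block-treatment-plot pattern; they are not taken from our actual pipeline.

```python
def make_event_id(year, block, treatment, plot, sep="-"):
    """Join the fields that uniquely describe an event into one readable ID.

    All field values here are illustrative assumptions.
    """
    parts = [year, block, treatment, plot]
    return sep.join(str(p) for p in parts)


# Hypothetical example record
event_id = make_event_id(2021, "B3", "warming", 12)
print(event_id)  # 2021-B3-warming-12
```

Anyone reading this ID can infer the year, block, treatment, and plot, which is exactly the property the opacity recommendation below argues against.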

However, some research on persistent identifiers led me to a GBIF guide ("A Beginner’s Guide to Persistent Identifiers") that states:

"It is important to understand that Persistent Identifiers are intended for computers to communicate with other computers. As such, they should be invisible to most users. In fact one of the important qualities of a good Persistent Identifier is opacity. That is, the identifier itself should not contain any readily identifiable information."

We still need to check, though, whether this applies only to PIDs at the data set level or also to identifiers at the record level.

Since our current ID structure does not comply with this recommendation, we would then also need to decide whether we want, or have, to change it in the future.
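
For discussion, one possible way to move towards opaque IDs is sketched below. It assumes we would assign a random UUID (version 4, from Python's standard `uuid` module) to each distinct event and keep the descriptive fields in their own columns. The helper `assign_opaque_ids` and the example keys are hypothetical, and this is only one option among several, not a decision.

```python
import uuid


def assign_opaque_ids(composite_keys):
    """Map each distinct composite key to a random, opaque UUID string.

    The same composite key always gets the same UUID within one run,
    because the mapping is kept in a lookup table.
    """
    mapping = {}
    for key in composite_keys:
        if key not in mapping:
            mapping[key] = str(uuid.uuid4())
    return mapping


# Hypothetical composite keys; the third repeats the first event
keys = ["2021-B3-warming-12", "2021-B3-control-13", "2021-B3-warming-12"]
lookup = assign_opaque_ids(keys)

for key, opaque_id in lookup.items():
    print(key, "->", opaque_id)
```

Because version 4 UUIDs are random, the lookup table would have to be stored alongside the data so that the same event keeps the same ID across releases.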

StefanVriend commented 5 months ago

Related:

In a EUDAT presentation on what makes a PID 'good':

Opaque, a "dumb number". A well-known pattern invites assumptions that may be misleading. Meaningful semantics invite intellectual property disputes and language problems.

NOID: Nice Opaque IDentifier.