CatalogueOfLife / general

The Catalogue of Life

add a fossil geologic time range field #54

Closed mdoering closed 5 years ago

mdoering commented 5 years ago

For palaeo taxa it is key to know the geologic time during which the organism is known to have lived, i.e. the range of geologic time over which it is known from the fossil record.

Implementation would best be based on a start and an end field (integer or double) representing million years ago (Ma). Input could then also be parsed from known geological times like "Trias, Juras", while the numeric fields would make all kinds of range searching possible.
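A minimal sketch of what such a start/end pair and a range search could look like. The names `TimeRange`, `start_ma`, `end_ma` and `search` are illustrative assumptions, not fields agreed in this issue or in ColDP, and the convention that `start_ma` is the older (numerically larger) bound is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeRange:
    """Geologic time range in million years ago (Ma). Hypothetical type."""
    start_ma: float  # oldest known occurrence (larger number)
    end_ma: float    # youngest known occurrence (smaller number)

    def __post_init__(self):
        # Ages count backwards from the present, so the start is the larger value.
        if self.start_ma < self.end_ma:
            raise ValueError("start_ma must be >= end_ma (older bound first)")

    def overlaps(self, other: "TimeRange") -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.start_ma >= other.end_ma and other.start_ma >= self.end_ma


def search(taxa: dict[str, TimeRange], query: TimeRange) -> list[str]:
    """Return the names of all taxa whose range overlaps the query range."""
    return sorted(name for name, rng in taxa.items() if rng.overlaps(query))
```

With two numeric bounds, queries such as "everything known between 60 and 50 Ma" reduce to a simple interval overlap test, which is also easy to express in SQL or a search index.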

mdoering commented 5 years ago

see https://github.com/Sp2000/coldp/issues/17

mdoering commented 5 years ago

One issue to consider is that the exact year ranges assigned to geological periods change quite frequently.

yroskov commented 5 years ago

PaleoBioDB demonstrates a professional approach. It uses two fields, Age range and Distribution, the latter naming geological periods. For example, Age range: 55.8 to 5.332 Ma; Distribution: found only at Rio Picheleufu (Eocene to Miocene of Argentina).
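As a rough illustration, an "Age range" string of the form shown in that example could be split into two numeric bounds as follows. The function name `parse_age_range` and the accepted pattern are assumptions based only on the single example above, not on any PaleoBioDB specification.

```python
import re

# Matches strings such as "55.8 to 5.332 Ma" (pattern assumed from one example).
_AGE_RANGE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s+to\s+(\d+(?:\.\d+)?)\s*Ma\s*$",
    re.IGNORECASE,
)


def parse_age_range(text: str) -> tuple[float, float]:
    """Return (older_ma, younger_ma) parsed from an 'X to Y Ma' string."""
    match = _AGE_RANGE.match(text)
    if match is None:
        raise ValueError(f"unrecognised age range: {text!r}")
    first, second = float(match.group(1)), float(match.group(2))
    # Normalise so the older (larger) value always comes first.
    return max(first, second), min(first, second)
```

Calling `parse_age_range("55.8 to 5.332 Ma")` yields the pair `(55.8, 5.332)`; anything that does not fit the pattern raises `ValueError` rather than being guessed at.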

However, this would be a further expansion of the standard dataset.

mdoering commented 5 years ago

It is an expansion, but one that strongly relates to the existing extinct flags and one that will be vital for any serious use of fossil names. Please see the ColDP issue above, which already has a longer discussion.

mdoering commented 5 years ago

... and by the way, PaleoBioDB uses at least 3 slightly different ways to express the same thing: Age range, min_ma & max_ma for min/max million years, and eag & lag for earliest/latest age are used in different places in their API and webpages.

An interesting blog post about PaleoBioDB's fossil age data mentions first and last appearance dates (FADs and LADs) of taxa.

GBIF and DwC have used livingPeriod for a decade now, a term that also works better for extant species, where "age" is usually understood as a property of an individual or as the average/max age of a species.

Wikipedia uses temporalRange in the infoboxes of its well-known species pages.

The International Fossil Plant Names Index again uses Stratigraphy.

mjy commented 5 years ago

> ... and by the way, PaleoBioDB is using at least 3 slightly different ways to express the same. Age range, min_ma & max_ma for min/max million years and eag & lag for earliest/latest age is used in different places in their API and webpages.

At the persistence level, isn't this just two ways, and aren't they basically the same as what you propose?

Age range is just the concatenation of min_ma and max_ma. Eag and Lag are the "textual" correspondences.

I think your 4 field model is fine @mdoering: 2 fields for exact values (best practice) and 2 for cached string values. The meaning of the latter can unfortunately be altered every year by the Union; they are for caching corresponding labels OR accommodating those who don't have exact date ranges, i.e. doing the best that we can without exact values.
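A minimal sketch of such a four field record, with the labels used as a fallback when no exact values are given. All names here (`FossilRange`, `ILLUSTRATIVE_SCALE`, `resolved`) are hypothetical, and the bounds in the lookup table are illustrative placeholders only: the outer values echo the PaleoBioDB example quoted earlier in this thread, and real bounds change whenever the time scale standard is revised.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative (older_ma, younger_ma) bounds per label. NOT an authoritative
# time scale; a real table would be versioned per release of the standard.
ILLUSTRATIVE_SCALE = {
    "Eocene": (55.8, 33.9),
    "Miocene": (23.03, 5.332),
}


@dataclass
class FossilRange:
    start_ma: Optional[float] = None     # exact older bound, if known
    end_ma: Optional[float] = None       # exact younger bound, if known
    start_label: Optional[str] = None    # cached label, e.g. "Eocene"
    end_label: Optional[str] = None      # cached label, e.g. "Miocene"

    def resolved(self, scale=ILLUSTRATIVE_SCALE):
        """Prefer exact values; fall back to label bounds from the given scale."""
        start = self.start_ma
        if start is None and self.start_label in scale:
            start = scale[self.start_label][0]  # older bound of earliest label
        end = self.end_ma
        if end is None and self.end_label in scale:
            end = scale[self.end_label][1]      # younger bound of latest label
        return start, end
```

Keeping the exact values and the labels in separate fields means a revised time scale only changes what the labels resolve to, while explicitly supplied numbers stay untouched.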

mjy commented 5 years ago

Careful: "Stratigraphy" is not the same thing. It's names for geological layers, not temporal periods.

Perhaps FAD and LAD are better ways of thinking of values for CoL+? I.e. they are summaries of what is known?

yroskov commented 5 years ago

Relationship between Stratigraphy and Geologic time scale: https://en.wikipedia.org/wiki/Geologic_time_scale

mdoering commented 5 years ago

@mjy if you look at the IFPNI values for their Stratigraphy you can see they provide geological time periods there, e.g. Emsian. But yes, stratigraphy really isn't the same. It is just the basis for knowing about a specimen's age.

mjy commented 5 years ago

@mdoering I think this is labelling overlap, not conceptual identity. When persisted we need to be careful what we are asserting.

For example, if you tell me there is a temporal time scale, and that you have the Emsian period, then I can tell you, according to the standard, in any given year, without question, what the preceding and succeeding temporal time periods are. I can also be assured that the temporal bounds of the Emsian period may change if the standard changes. I cannot tell you anything about the physical makeup of that layer.
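The point that an ordered temporal scale fixes each interval's neighbours can be sketched with a plain ordered list. The list below covers only the Devonian stages, oldest to youngest, and deliberately carries no dates, since those are what a revised standard may change; `DEVONIAN_STAGES` and `neighbours` are illustrative names.

```python
# Devonian stages in temporal order, oldest first. Dates are omitted on
# purpose: the ordering is stable, the numeric bounds are not.
DEVONIAN_STAGES = [
    "Lochkovian",
    "Pragian",
    "Emsian",
    "Eifelian",
    "Givetian",
    "Frasnian",
    "Famennian",
]


def neighbours(interval, scale=DEVONIAN_STAGES):
    """Return (older, younger) neighbours of an interval within the given scale.

    None marks the edge of this (truncated) list, not the edge of geologic time.
    """
    i = scale.index(interval)  # raises ValueError for unknown labels
    older = scale[i - 1] if i > 0 else None
    younger = scale[i + 1] if i < len(scale) - 1 else None
    return older, younger
```

Here `neighbours("Emsian")` returns `("Pragian", "Eifelian")`. No comparable lookup exists for stratigraphic layers at a given locality, which is the contrast drawn in the rest of this comment.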

If you tell me you have the Emsian stratigraphic layer, I can tell you that you probably mean a layer with some temporal bounds that match the temporal time scale, but I cannot tell you the stratigraphic layers I would find immediately above and below it; these may be altered by geological processes, re-arranged, completely removed, basically "randomized". I can also fix the date for my stratigraphic layer, and, when the temporal standard changes, this may or may not change for my specific location, depending on what I know about the layer there. Furthermore, I can likely find out more about the actual physical makeup of that layer by finding the published stratigraphic column that defined the layer (in that place).

Stratigraphic nomenclature concerns more immediate, instance/field-based science; temporal nomenclature reflects a broader summary/understanding. Mixing the labels across temporal/stratigraphic is a convenience rather than a best practice IMO. Unfortunately, from my understanding, there are many non-governed stratigraphic labels that are not in any unified temporal time frame; it really is a mess.

So, users re-use labels for temporal time periods when they help to define stratigraphic concepts, but not all stratigraphic labels are to be considered temporally arranged, etc. etc.

... i.e. we're agreeing I think.