CatalogueOfLife / coldp

Handling more than 1 type per name? #24

Closed gdower closed 4 years ago

gdower commented 4 years ago

In order to handle multiple types per name, we might need to move types into a separate table.

mdoering commented 4 years ago

Like Taxon.referenceID, typeMaterial may contain multiple values by concatenating them. I see that the field definition does not state the expected delimiter. I would suggest using |, since the comma we use for Taxon.referenceID is likely to occur inside the value itself.
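
As a minimal sketch of why the delimiter matters, the snippet below splits a concatenated typeMaterial value on `|` and on `,`. The helper `split_multi_value` and the sample citation text are hypothetical and used only for illustration; they are not part of the ColDP specification.

```python
def split_multi_value(raw, delimiter="|"):
    """Split a concatenated field into trimmed, non-empty values."""
    if not raw:
        return []
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


# Hypothetical typeMaterial value: each citation contains commas itself,
# so a comma delimiter would cut the citations in the wrong places.
type_material = "Holotype: Museum A, City X, no. 1 | Paratype: Museum B, City Y, no. 2"

pipe_values = split_multi_value(type_material, "|")   # two whole citations
comma_values = split_multi_value(type_material, ",")  # five fragments
```

Splitting on `|` keeps each citation intact, while splitting on `,` breaks them into fragments that no longer correspond to individual types.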

mdoering commented 4 years ago

In case the typeReferenceID and typeStatus differ for each type specimen, things get a bit messy.

mdoering commented 4 years ago

@gdower do you think it's better to handle types with a dedicated TypeSpecimen entity? It becomes more difficult to map, but more accurate.

gdower commented 4 years ago

Scarabs has ~1500 names with more than 1 type almost all of which have a different type status, and some have different type references. I think it would be better organized and probably easier for most users to handle in a separate NameType table.

mdoering commented 4 years ago

A separate entity makes sense. I just want to avoid an intermediate many-to-many "table". If we attach types only to original names / protonyms and not to recombinations, the vast majority of types are unique to a name. For illegal names, conserved ones etc. this might be different.

Options

  1. Types have a unique ID, and the name refers to them, allowing multiple type IDs to be concatenated as we do with referenceID. This also allows recombinations to cite types.
  2. Types have a nameID property, which can also be a list if the same type refers to multiple names. This gets cumbersome for recombinations, and we should recommend its use for original names only.
  3. A separate TypeDesignation table that connects a single nameID with a typeID and also takes the referenceID to refer to a publication for the (subsequent) designation. (In the scenarios above the referenceID would be part of the types table instead.) Most correct, but also most complex for publishing just a single type citation.
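
The three options can be sketched side by side as follows. This is only an illustration under stated assumptions: the class names `Name`, `TypeRecord` and the lookup helpers are hypothetical, the IDs are made up, and where typeStatus would live under option 3 is not specified in this thread, so it is simply left on the type record.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TypeRecord:
    ID: str
    status: str
    referenceID: Optional[str] = None                # options 1 and 2 keep the reference here
    nameID: List[str] = field(default_factory=list)  # only filled in option 2


@dataclass
class Name:
    ID: str
    typeID: List[str] = field(default_factory=list)  # only filled in option 1


@dataclass
class TypeDesignation:                               # only used in option 3
    nameID: str
    typeID: str
    referenceID: Optional[str] = None


def types_opt1(name, types):
    """Option 1: follow the type IDs listed on the name."""
    by_id = {t.ID: t for t in types}
    return [by_id[i].ID for i in name.typeID]


def types_opt2(name_id, types):
    """Option 2: find the types that list this name."""
    return [t.ID for t in types if name_id in t.nameID]


def types_opt3(name_id, designations):
    """Option 3: go through the designation table; the reference lives there."""
    return [(d.typeID, d.referenceID) for d in designations if d.nameID == name_id]


# One name with two types that have different statuses and references (made-up IDs).
opt1_types = [TypeRecord("t1", "lectotype", "r2"), TypeRecord("t2", "paralectotype", "r3")]
opt1_name = Name("n1", typeID=["t1", "t2"])

opt2_types = [
    TypeRecord("t1", "lectotype", "r2", nameID=["n1"]),
    TypeRecord("t2", "paralectotype", "r3", nameID=["n1"]),
]

opt3_designations = [TypeDesignation("n1", "t1", "r2"), TypeDesignation("n1", "t2", "r3")]

result1 = types_opt1(opt1_name, opt1_types)
result2 = types_opt2("n1", opt2_types)
result3 = types_opt3("n1", opt3_designations)
```

All three layouts can carry a different status and reference per type; they differ in which record holds the link between name and type, and therefore in how much a publisher has to supply for a single type citation.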

mdoering commented 4 years ago

what about this?

[image: Bildschirmfoto 2019-12-17 um 15 43 42]

mjy commented 4 years ago

Of course the questions will come with the collection object fields, host, date, locality. People are going to want catalog number, male, female, etc. Practically an infinite number of fields.

In other words, this is a stop-gap solution for the real model, which has a separate concept for CollectionObjects. I would propose that you have 3 fields if you go this route, all verbatim: verbatim collecting events, verbatim determinations, and verbatim other labels. Since the values in the fields will be completely uncomputable anyway, people can stuff what they want into these three and move on.
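
A minimal sketch of such a three-field record, assuming hypothetical field names derived from the wording above (they are not existing ColDP terms):

```python
from dataclasses import dataclass


@dataclass
class TypeMaterialVerbatim:
    """Free-text type record; none of the three text fields is parsed."""
    nameID: str
    verbatimCollectingEvents: str = ""
    verbatimDeterminations: str = ""
    verbatimOtherLabels: str = ""


# Hypothetical record: the label text is stored as-is, not split into
# locality, collector, date, host and so on.
record = TypeMaterialVerbatim(
    nameID="n1",
    verbatimCollectingEvents="label text exactly as printed",
)
```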

For what it's worth TypeMaterial in TW is the relationship between a nomenclatural name and a specimen, nothing more. It's also important to note that this is not the same thing as a taxon determination because this relationship is objective, and a taxon determination is subjective.

So, no easy answer on your drift towards the TW model ;)

On Tue, Dec 17, 2019 at 8:44 AM Markus Döring notifications@github.com wrote:

what about this? [image: Bildschirmfoto 2019-12-17 um 15 43 34] https://user-images.githubusercontent.com/327505/71005526-124eae00-20e4-11ea-9425-8384684a4f2e.png

mdoering commented 4 years ago

It's tempting to flesh out all properties, but that is indeed a beast, as you can see in Darwin Core. Also, what use is it to have parsed-out specimen data? All we are really interested in is the type status, the reference in case of subsequent designations, the human-readable citation and, if one exists, a link to the specimen elsewhere. I doubt that having type locality separate from collector, institution code, catalogue number, collecting date, host, etc. helps us in any way. Maybe it's best to stick with the simplest approach we had when it was part of the Name entity:

mdoering commented 4 years ago

... country or coordinate could be useful to present the specimen on a map

mdoering commented 4 years ago

[image: Bildschirmfoto 2019-12-17 um 18 08 17]