DINA-Web / taxonomy

DINA Taxonomy module
MIT License

Taxonomy terms #4

Closed: kcranston closed this issue 6 years ago

kcranston commented 6 years ago

In the use case glossary, we define a classification as a hierarchy of taxa. In PR #1, I suggest an mrca method that takes two or more taxa as parameters, and @gnewton suggests replacing taxon with node, with the comment:

The nodes are where the taxa have been placed or assigned. The taxa exist independently of the tree and the node. This is my nomenclature so we can rename it but this is the representation.

This is a good point - we do need to be clear about terminology. Not convinced that we want to use both 'taxon' and 'node' in the API methods / parameters, though. Are a taxon and a node different entities in the taxonomy model? Do they each have their own IDs? How does a user get a nodeID?

cgendreau commented 6 years ago

I see the nodes as a consequence of handling multiple classifications. If you look at a single classification, a taxon can only be at one place (or not in the classification at all). So, with multiple classifications, if we specify the id of the tree, I would think it's probably ok to use the taxonId directly. But internally a "nodes" table would be queried, so we could say that the real resource is the node and not the taxon. On the same topic, the object returned by /node/1 would be different from the taxon, since the taxon would also return all the nodes that point to it.

If we support multiple parallel classifications, I would say that adding the node resource looks like a good idea to me. I would then add a new term in the glossary with the definition: "Instance of a taxon inside a specific classification". I would prefer a more specific term than "node", but I'm not sure I can do better than "taxon_node" :)
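A minimal sketch of the model described above, using hypothetical names (`Taxon`, `Node`, `NodeTable`, `node_for`, `nodes_of`) that are not part of any agreed DINA API. It assumes a node is an instance of a taxon inside one classification, so a (classification id, taxon id) pair resolves to at most one node, while a taxon can list every node that points to it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taxon:
    # The taxon exists independently of any classification.
    taxon_id: int
    name: str


@dataclass(frozen=True)
class Node:
    # Hypothetical "taxon_node": one placement of a taxon in one classification.
    node_id: int
    classification_id: int
    taxon_id: int
    parent_node_id: Optional[int]  # None for the root of a classification


class NodeTable:
    """Stand-in for the internal "nodes" table mentioned above."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def node_for(self, classification_id, taxon_id):
        """Return the taxon's single node in one classification, or None."""
        matches = [n for n in self._nodes
                   if n.classification_id == classification_id
                   and n.taxon_id == taxon_id]
        if len(matches) > 1:
            raise ValueError("a taxon may appear only once per classification")
        return matches[0] if matches else None

    def nodes_of(self, taxon_id):
        """Return every node, across all classifications, pointing to the taxon."""
        return [n for n in self._nodes if n.taxon_id == taxon_id]


# Example data (made up): one taxon placed in two parallel classifications.
puma = Taxon(taxon_id=10, name="Puma concolor")
table = NodeTable([
    Node(node_id=1, classification_id=100, taxon_id=10, parent_node_id=None),
    Node(node_id=2, classification_id=200, taxon_id=10, parent_node_id=None),
])
```

With this layout, a caller who knows the tree id can keep passing the taxonId, and the node id stays an internal detail unless the node is exposed as its own resource.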

gnewton commented 6 years ago

A couple of things:

1. Assumption: the API is for developers, not for taxonomists. While it would be optimal for terms to be the same everywhere, there may be times when the terms used in the API differ from those used by taxonomists, where that makes sense.
2. We need to support multiple classifications. Different collections will be in different classifications in many institutions. At AAFC it is likely to go this way, though that is not yet clear. So a taxon needs to be able to be in multiple classifications. The data structure that makes the classification<--->taxon relationship is the node.

I am not keen on 'taxon_node'. But I am not too tied to naming, except when it might lead to confusion. But I am pretty sure this is the way we need to go.
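To illustrate the node as the link between classification and taxon, here is a rough sketch of how an mrca call (as proposed in PR #1) could accept taxon ids plus a classification id and resolve them to nodes internally. The function signature and the two dictionaries are assumptions made for illustration only, not the module's actual design.

```python
def mrca(parent_of, node_of, classification_id, taxon_ids):
    """Most recent common ancestor of two or more taxa in one classification.

    parent_of: maps node id -> parent node id (None at the root). Hypothetical.
    node_of:   maps (classification id, taxon id) -> node id. Hypothetical.
    Returns the node id of the MRCA. A taxon counts as its own ancestor here.
    """
    if len(taxon_ids) < 2:
        raise ValueError("mrca needs two or more taxa")

    def path_to_root(node_id):
        # Walk from the node up to the root, collecting node ids on the way.
        path = []
        while node_id is not None:
            path.append(node_id)
            node_id = parent_of[node_id]
        return path

    # A KeyError here means the taxon is not placed in this classification.
    paths = [path_to_root(node_of[(classification_id, t)]) for t in taxon_ids]

    # Nodes that lie on every path to the root.
    common = set(paths[0]).intersection(*paths[1:])

    # The first shared node met while walking up from the first taxon
    # is the deepest shared one, i.e. the MRCA.
    for node_id in paths[0]:
        if node_id in common:
            return node_id
    raise ValueError("taxa share no ancestor in this classification")


# Example data (made up): classification 100 has root node 1,
# with nodes 2 and 5 below it, and nodes 3 and 4 below node 2.
parent_of = {1: None, 2: 1, 3: 2, 4: 2, 5: 1}
node_of = {(100, 10): 3, (100, 11): 4, (100, 12): 5}

print(mrca(parent_of, node_of, 100, [10, 11]))      # 2
print(mrca(parent_of, node_of, 100, [10, 11, 12]))  # 1
```

The same taxon ids could resolve to different nodes, and so to a different MRCA, under another classification id, which is why the classification has to be part of the call.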

In reply to Christian Gendreau's comment of 20 February 2018 at 20:20 (quoted above).

kcranston commented 6 years ago

Closing this issue, as I think the questions about objects and names will get sorted out when we make decisions about the model (see #2) and the API (see #3).