gwemon / P02-Biological-terms-review

A collaborative project to review and streamline biological codes in the SDN Parameter Discovery Vocab P02.
Rationalise abundance and biomass codes #3

Open Daphnisd opened 7 years ago

Daphnisd commented 7 years ago

Looking at option 1 and option 2 on https://github.com/gwemon/P02-Biological-terms-review:

For option 1: it's unclear to me under which P02 generic terms such as "Abundance of biological entity specified elsewhere per unit volume of the water body" (http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL01/) could be included. As the term is generic, it can be used for most of the functional groups that would have a dedicated P02. Would this term then be included under all relevant P02s? I do not see how we can make sure that a user looking for all phytoplankton abundances will get all phytoplankton abundances and only phytoplankton abundances.
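The concern can be sketched with a toy example. This is only an illustration under stated assumptions: apart from SDBIOL01, every code, P02 label and dataset below is hypothetical and not taken from the vocabularies.

```python
# Toy illustration of option 1: the generic P01 code is listed under every
# relevant P02. Only SDBIOL01 is a real code; everything else is hypothetical.
P02_TO_P01 = {
    "phytoplankton abundance": {"PHYTO_SPECIFIC_1", "SDBIOL01"},
    "zooplankton abundance": {"ZOO_SPECIFIC_1", "SDBIOL01"},
}

# Hypothetical datasets: the P01 code each one uses and the functional group
# that was actually measured (information the generic code does not carry).
DATASETS = [
    {"name": "A", "p01": "PHYTO_SPECIFIC_1", "group": "phytoplankton"},
    {"name": "B", "p01": "SDBIOL01", "group": "phytoplankton"},
    {"name": "C", "p01": "SDBIOL01", "group": "zooplankton"},
]


def search_by_p02(p02_label):
    """Return the names of datasets whose P01 code is mapped to the given P02."""
    codes = P02_TO_P01[p02_label]
    return [d["name"] for d in DATASETS if d["p01"] in codes]


hits = search_by_p02("phytoplankton abundance")
print(hits)  # ['A', 'B', 'C']
```

Dataset C holds zooplankton data but is still returned, because the search sees only the generic code and not the functional group.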

Option 2 seems a better fit for the generic terms, but may be less suited to discovering the non-generic terms. I don't understand how we would still be able to target specific groups using option 2, even with a one-to-many relationship.

I do not know if we want to continue with the non-generic terms. Are people still using them, and if so, can they be persuaded to switch to the generic ones? Perhaps we would need two separate systems, and the non-generic one could in time be deprecated?

gwemon commented 7 years ago

Thanks Daphnis. Below are a few thoughts related to your comment:

It helps to focus on use cases, so the one you mentioned, of a user looking for phytoplankton abundance data and only phytoplankton abundance data, is one to keep in mind.

To start with your last question: I never considered deprecating the non-generic terms because they are very much in use (and new terms are requested). However, I do not want to exclude any option. For me, deprecation is not desirable because the two types of codes (SDN-Bio and the taxon-specific codes) serve two valid use cases.

With option #1 the SDN-Bio P01 codes (i.e. %biological entity specified elsewhere%) would remain mapped to BPRP (SeaDataNet biological format biotic parameters). We could look at options to make them more discoverable, for example by improving the mappings between BPRP and the P03 categories. But still, because the codes are generic, there is no way that a user could filter on functional groupings unless that information is available.

With option #2, as it stands, yes, we would lose the ability to target specific groups unless we introduce a many-to-many relationship between P02 and P01 and add the functional groups too. But still, it would not allow users to discover P01 codes associated with phytoplankton abundance unless the user is able to select "phytoplankton" AND "abundance". On the other hand, with option #1 the user selecting "Phytoplankton abundance" would get all the phytoplankton abundance codes mapped to that P02 group. But of course this would exclude any CDI containing SDN-Bio P01 codes. We might need another mechanism for filtering SDN-Bio data?
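A rough sketch of the many-to-many variant of option #2 with an AND filter might look like the following. It is an illustration only: apart from SDBIOL01, the P01 codes and the P02 facet labels are hypothetical placeholders, and nothing here reflects how the CDI interface is actually implemented.

```python
# Toy many-to-many mapping for option #2: each P01 code carries a set of
# P02 facets (measurement type and, where known, functional group).
# Only SDBIOL01 is a real code; the other codes and labels are hypothetical.
P01_TO_P02 = {
    "PHYTO_ABUND_CODE": {"abundance", "phytoplankton"},
    "PHYTO_BIOMASS_CODE": {"biomass", "phytoplankton"},
    "ZOO_ABUND_CODE": {"abundance", "zooplankton"},
    "SDBIOL01": {"abundance"},  # generic: no functional group to attach
}


def discover(*selected):
    """Return P01 codes tagged with ALL of the selected P02 facets."""
    wanted = set(selected)
    return sorted(code for code, facets in P01_TO_P02.items() if wanted <= facets)


print(discover("abundance"))
# ['PHYTO_ABUND_CODE', 'SDBIOL01', 'ZOO_ABUND_CODE']
print(discover("phytoplankton", "abundance"))
# ['PHYTO_ABUND_CODE']
```

Selecting "abundance" alone returns every group, and the combined selection drops the generic code, so SDN-Bio data would still need a separate way to record the functional group.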

To help us look at examples it might be good to focus on one user interface. Are SDN-bio format data available via the common interface at http://seadatanet.maris2.nl/v_cdi_v3/search.asp?

Daphnisd commented 7 years ago

Some examples of BIO-DEV datasets in SeaDataNet: Datasetname: