fdiwg / fdi-codelists


Taxonomy in ASFIS #15

Open manuchassot opened 1 year ago

manuchassot commented 1 year ago

The ASFIS code list includes information on FAMILY and ORDER. This taxonomic information can be very useful for some analyses, but it is not clear (to me) where it comes from (which authority?) or how it is updated. Furthermore, other taxonomic levels (tribe, subfamily, etc.) might be useful to ASFIS users for different studies.

Associating each ASFIS species or species group with the identifiers used in global taxonomic databases would make it possible to retrieve taxonomic information remotely from reference databases such as ITIS (see https://ropensci.org/packages/taxonomy/).

Simple example below for Thunnus albacares:

library(data.table)
library(ritis)

EXAMPLE = data.table(SPECIES_CODE = "YFT", SPECIES_SCIENTIFIC = "Thunnus albacares", TSN = 172423)

EXAMPLE[, ORDER := data.table(hierarchy_full(TSN))[rankname == "Order", taxonname], by = .(SPECIES_SCIENTIFIC)]

EXAMPLE[, FAMILY := data.table(hierarchy_full(TSN))[rankname == "Family", taxonname], by = .(SPECIES_SCIENTIFIC)]
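The two assignments above each query ITIS separately for a single rank. A possible variation, sketched here under stated assumptions, fetches the hierarchy once and spreads any requested ranks (including levels such as subfamily) into columns. `ranks_wide()` is a hypothetical helper, not part of ritis; it only relies on the `rankname` and `taxonname` columns used above. The `mock_hierarchy` table is a hand-built stand-in so the sketch runs offline; with network access it would be replaced by `hierarchy_full(172423)`.

```r
library(data.table)

# Hypothetical helper: turn a long hierarchy table (one row per rank) into a
# one-row wide table with one column per requested rank.
# Assumes the columns `rankname` and `taxonname`, as in the example above.
ranks_wide <- function(hierarchy, ranks = c("Order", "Family", "Genus")) {
  h <- as.data.table(hierarchy)
  out <- lapply(ranks, function(r) {
    value <- h[rankname == r, taxonname]
    # A rank absent from the hierarchy is returned as NA, not dropped
    if (length(value) == 0L) NA_character_ else value[1L]
  })
  names(out) <- toupper(ranks)
  as.data.table(out)
}

# Stand-in for hierarchy_full(172423); illustrative rows only
mock_hierarchy <- data.table(
  rankname  = c("Order", "Family", "Genus", "Species"),
  taxonname = c("Perciformes", "Scombridae", "Thunnus", "Thunnus albacares")
)

EXAMPLE <- data.table(SPECIES_CODE = "YFT",
                      SPECIES_SCIENTIFIC = "Thunnus albacares",
                      TSN = 172423)

# One lookup, several ranks added by reference
wide <- ranks_wide(mock_hierarchy, ranks = c("Order", "Family", "Subfamily"))
EXAMPLE[, (names(wide)) := wide]
```

Because the stand-in table has no subfamily row, `SUBFAMILY` is `NA` here; whether a real ITIS hierarchy carries that rank for a given species would need to be checked case by case.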

I propose either adding the identifiers from ITIS and WoRMS (and possibly other databases) to ASFIS, or doing the work on the FAO side and extending ASFIS with the standard taxonomic levels in addition to ORDER and FAMILY.

eblondel commented 1 year ago

IMHO (personal opinion only), the rationale of ASFIS lies in the codelist itself, with at minimum a common vernacular name and a scientific name used for reference (a snapshot, at release time, of what the taxonomic community agreed to use as the valid scientific name). It becomes a hierarchical codelist/classification when we try to code an upper level in the taxonomic tree. All the taxonomic information is quite "static" in its current version, although there are existing systems such as ITIS, WoRMS and FishBase that specialize in taxonomic information.

We may need to set a valid scientific name at each ASFIS release (which could then be referred to in case studies exploiting ASFIS as a flat 'codelist'), but this maintenance should be based on regular updates that connect to systems like ITIS and pull taxonomic hierarchy data from them, so that ASFIS stays in sync with specialized taxonomic information systems.

But what happens when these specialized knowledge bases are not aligned with each other? We sometimes see discrepancies between ITIS, WoRMS and FishBase; I am not sure why, but update time lags may be one reason.

The opportunity to update ASFIS taxonomic information based on these systems should be presented. It is all the more useful since we are not starting from scratch: libraries (like ritis in R) already make it possible to do this automatically and quickly, which should also foster more regular updates of ASFIS.

manuchassot commented 1 year ago

Following what you said about potential inconsistencies between WoRMS and ITIS, the following example shows two different orders for yellowfin tuna:

library(data.table)
library(ritis)
library(worrms)

EXAMPLE = data.table(SPECIES_CODE = "YFT", SPECIES_SCIENTIFIC = "Thunnus albacares", TSN = 172423, APHIAID = 127027)

EXAMPLE[, ORDER_ITIS := data.table(hierarchy_full(TSN))[rankname == "Order", taxonname], by = .(SPECIES_SCIENTIFIC)]

EXAMPLE[, ORDER_WORMS := data.table(wm_classification(APHIAID))[rank == "Order", scientificname], by = .(SPECIES_SCIENTIFIC)]

EXAMPLE[, FAMILY_ITIS := data.table(hierarchy_full(TSN))[rankname == "Family", taxonname], by = .(SPECIES_SCIENTIFIC)]

EXAMPLE[, FAMILY_WORMS := data.table(wm_classification(APHIAID))[rank == "Family", scientificname], by = .(SPECIES_SCIENTIFIC)]

   SPECIES_CODE SPECIES_SCIENTIFIC    TSN APHIAID  ORDER_ITIS   ORDER_WORMS FAMILY_ITIS
1:          YFT  Thunnus albacares 172423  127027 Perciformes Scombriformes  Scombridae
   FAMILY_WORMS
1:   Scombridae
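Once both sources sit side by side, disagreements like the one above could be flagged automatically across the whole code list. The following is a minimal sketch, assuming the `<RANK>_ITIS` / `<RANK>_WORMS` column naming used in the example; `flag_mismatches()` is a hypothetical helper, and the values are copied from the output above so that it runs without network access.

```r
library(data.table)

# Values copied from the output above (no call to ITIS or WoRMS needed)
EXAMPLE <- data.table(
  SPECIES_CODE       = "YFT",
  SPECIES_SCIENTIFIC = "Thunnus albacares",
  TSN                = 172423,
  APHIAID            = 127027,
  ORDER_ITIS         = "Perciformes",
  ORDER_WORMS        = "Scombriformes",
  FAMILY_ITIS        = "Scombridae",
  FAMILY_WORMS       = "Scombridae"
)

# Hypothetical helper: add one <RANK>_MISMATCH column per rank, TRUE where
# the ITIS and WoRMS names differ. Works on a copy, leaving the input as is.
flag_mismatches <- function(dt, ranks = c("ORDER", "FAMILY")) {
  out <- copy(dt)
  for (r in ranks) {
    itis  <- out[[paste0(r, "_ITIS")]]
    worms <- out[[paste0(r, "_WORMS")]]
    # If either side is NA the comparison yields NA (cannot be compared)
    set(out, j = paste0(r, "_MISMATCH"), value = itis != worms)
  }
  out
}

CHECKED <- flag_mismatches(EXAMPLE)
CHECKED[, .(SPECIES_CODE, ORDER_MISMATCH, FAMILY_MISMATCH)]
```

For yellowfin tuna this flags the order (Perciformes vs. Scombriformes) but not the family. A plain string comparison cannot tell which source is more up to date; it only points to the rows that deserve a closer look.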
