Open PietrH opened 1 year ago
Repeatable with:

    raw_data <- readr::read_csv(
      here::here("data", "raw", "rato_data.csv")
    )
    dplyr::filter(raw_data, Dossier_ID == 41583)
@LienReyserhove
In some records, there are Dutch vernacular names in `Opmerkingen`, for example:
"12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)"
in occurrenceID: 498235
Based on the GBIF ID, this is mapped to a different turtle. I believe they were trying to catch 2 individuals of species `a`, but caught 1 of `a` and 1 of `b`. The data, however, shows this as 2 of `a`.
Do I understand this correctly? How would you handle this? Should this be fixed on the data side, or should we make an exception just for this record in the mapping?
Do you think we should systematically check for vernacular and scientific names in `Opmerkingen` that don't match the `GBIF_Code` as part of the tests?
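A check like this could be sketched in base R as follows. This is only a rough sketch under assumptions: the lookup vector `known_vernacular` and the helper `find_mismatched_mentions()` are hypothetical, and the comparison is made against `Soort` rather than `GBIF_Code`, since a vernacular-name-to-GBIF-code lookup would still have to be built.

```r
# Hypothetical list of vernacular names to search for; a real check would
# need every species that can appear in the raw data.
known_vernacular <- c("Lettersierschildpad", "Zaagrugschildpad")

# Return the rows where Opmerkingen mentions a known vernacular name that
# differs from the species reported in Soort.
find_mismatched_mentions <- function(data, vernacular = known_vernacular) {
  mismatch <- vapply(
    seq_len(nrow(data)),
    function(i) {
      remark <- data$Opmerkingen[i]
      if (is.na(remark)) {
        return(FALSE)
      }
      # Only names other than the reported species count as a mismatch.
      others <- setdiff(vernacular, data$Soort[i])
      any(vapply(
        others,
        function(name) grepl(tolower(name), tolower(remark), fixed = TRUE),
        logical(1)
      ))
    },
    logical(1)
  )
  data[mismatch, , drop = FALSE]
}

# Two records copied from dossier 41583.
example <- data.frame(
  OBJECTID = c(477783, 487067),
  Soort = "Lettersierschildpad",
  Opmerkingen = c(
    "12/06: sabotage",
    "12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)"
  )
)
flagged <- find_mismatched_mentions(example)
flagged$OBJECTID  # 487067
```

A test could then assert that the flagged set is empty, or matches a list of reviewed exceptions.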
Same table as above, but with fewer columns:
Dossier_ID | OBJECTID | Dossier_Status | Soort | Waarneming | Actie | Opmerkingen | Laatst_Bewerkt_Datum |
---|---|---|---|---|---|---|---|
41583 | 428110 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-05-19 13:07:41 |
41583 | 430568 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-05-22 11:42:24 |
41583 | 437818 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | Vangst (aantal) = 1; | NA | 2023-05-26 10:42:52 |
41583 | 438973 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-05-30 08:42:31 |
41583 | 444885 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-06-02 09:20:24 |
41583 | 446113 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-06-05 10:13:53 |
41583 | 451724 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | NA | 2023-06-08 12:47:56 |
41583 | 455180 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage | 2023-06-12 13:00:05 |
41583 | 462626 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage | 2023-06-19 11:20:37 |
41583 | 471851 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage | 2023-06-26 12:25:41 |
41583 | 477783 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage | 2023-06-30 13:48:12 |
41583 | 487067 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | Vangst (aantal) = 1; | 12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar) | 2023-07-07 13:04:32 |
41583 | 492257 | Opvolging | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar) | 2023-07-14 20:23:27 |
41583 | 498235 | Verwerkt en afgesloten | Lettersierschildpad | Vastgesteld (aantal) = 2; | NA | 12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar) | 2023-07-24 09:38:24 |
The raw data reports species "Lettersierschildpad" with the correct GBIF ID; however, `Opmerkingen` mentions `12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)`, which is Graptemys sp.

This event has been in the dataset for a while, occurrences in current output:
In the raw data we can see only the last 3 records make mention of Graptemys sp.
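To make that count reproducible, the records can be selected by searching the remarks for the vernacular name. The snippet below is a small illustration on an excerpt of the table above (the last five records of the dossier), not on the full raw file; the object names `records` and `mentions` are made up for the example.

```r
# Excerpt of the table above: the last five records of dossier 41583.
records <- data.frame(
  OBJECTID = c(471851, 477783, 487067, 492257, 498235),
  Opmerkingen = c(
    "12/06: sabotage",
    "12/06: sabotage",
    "12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)",
    "12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)",
    "12/06: sabotage, 07/07: Zaagrugschildpad ( groot exemplaar)"
  )
)

# Keep only the records whose remarks mention the vernacular name.
mentions <- records[
  grepl("Zaagrugschildpad", records$Opmerkingen, fixed = TRUE),
]
mentions$OBJECTID  # 487067 492257 498235
```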
There are probably more examples of vernacular names being mentioned in `Opmerkingen`, some of which might also not match the `GBIF_Code` provided.

For now, I will not make an exception and parse just these records to add a new species, but I'll create an issue to be discussed when everyone involved is back from holidays.
Originally posted by @PietrH in https://github.com/riparias/rato-occurrences/issues/50#issuecomment-1667701498