BelgianBiodiversityPlatform / data-publication-GBIF

data published on human health
MIT License

Dataset Desmodes overview #1

Open DimEvil opened 2 years ago

DimEvil commented 2 years ago

library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

# Derive dwc_basisOfRecord: literature-derived datasets become MaterialCitation,
# everything else keeps its original basisOfRecord.
occurrence %<>% mutate(dwc_basisOfRecord = case_when(
  datasetName == 'Lee' ~ "MaterialCitation",
  datasetName == 'Zarza' ~ "MaterialCitation",
  basisOfRecord == 'UNKNOWN' ~ "occurrence",
  datasetName == 'literature' ~ "MaterialCitation",
  datasetName == 'Piaggio' ~ "MaterialCitation",
  datasetName == 'Juan Luis_Personal_Database' ~ "MaterialCitation",
  datasetName == 'Literature' ~ "MaterialCitation",
  datasetName == 'Bio_Diversi_Data_UY' ~ "MaterialCitation",
  datasetName == 'PCMS' ~ "MaterialCitation",
  datasetName == 'Streicker_Peru' ~ "MaterialCitation",
  datasetName == 'SAG' ~ "MaterialCitation",
  TRUE ~ basisOfRecord
))

pvandevuurst commented 2 years ago
DimEvil commented 2 years ago

Ok, fixed

ok

If the Lee paper data here (Desmodus_dataset_Dec_2021.csv) are not already on GBIF, we will keep them in the dataset. :)

The term refers to evidence of an occurrence taken from literature. More info can be found here: https://github.com/tdwg/dwc/issues/329

ok

I think there is an error somewhere: family and taxonRank seem to have changed places. [image]

Something is also not correct here: as the dataset is 100% Desmodus rotundus, family should be the same throughout.

[image]

fixed

I also changed M & F in dwc:sex to male and female, as these are the controlled vocabulary terms. :)
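For reference, the M/F recoding could be done with an exact-match lookup along these lines. This is only a sketch in base R: the sample data frame, the `sex_lookup` vector and the `dwc_sex` column name are assumptions for illustration, not the code actually used on the dataset.

```r
# Hypothetical sample; the real values come from the Desmodus dataset.
occurrence <- data.frame(
  sex = c("M", "F", "male", NA, "f"),
  stringsAsFactors = FALSE
)

# Exact-match lookup: names are the (lowercased) raw values,
# elements are the controlled vocabulary terms.
sex_lookup <- c(m = "male", f = "female", male = "male", female = "female")

# Normalise case and whitespace, then look each value up as a whole string.
key <- tolower(trimws(occurrence$sex))
occurrence$dwc_sex <- unname(sex_lookup[key])

print(occurrence)
```

Anything not listed in the lookup ends up as NA, so unexpected values show up as gaps instead of being silently rewritten.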

pvandevuurst commented 2 years ago

Sounds good. Looks like two of the columns (family and taxonRank) might have been switched at some point. Perhaps try downloading the most recent version (i.e., Desmodus_dataset_Apr_2022) from Figshare: https://figshare.com/articles/dataset/Desmodus_rotundus_Occurrence_Record_Database/15025296

DimEvil commented 2 years ago

Some more questions and remarks on the April 2022 version:

I have these values in basisOfRecord:
MachineHUMAN_OBSERVATIONbPRESERVED_SPECIMENervatiHUMAN_OBSERVATIONn
HumanHUMAN_OBSERVATIONbPRESERVED_SPECIMENervatiHUMAN_OBSERVATIONn

Still some errors in family <-> taxonRank (I created fixed terms here now)

[image]

All taxonRank values are now species (as all scientificName = Desmodus rotundis, no subspecies).

pvandevuurst commented 2 years ago

Okay, gotcha. Just went through and fixed those columns, so they should be okay now. For taxonRank species is fine, just make sure that Desmodus rotundus is spelled correctly and that should be good to go!
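A quick consistency check for these three columns might look like the sketch below. It assumes the whole dataset should be Desmodus rotundus (family Phyllostomidae, taxonRank species); the sample rows and variable names are made up for illustration.

```r
# Hypothetical sample with one misspelled name and one swapped row.
occurrence <- data.frame(
  scientificName = c("Desmodus rotundus", "Desmodus rotundis", "Desmodus rotundus"),
  family         = c("Phyllostomidae", "Phyllostomidae", "species"),
  taxonRank      = c("species", "species", "Phyllostomidae"),
  stringsAsFactors = FALSE
)

# Rows where family and taxonRank appear to have changed places.
swapped <- occurrence$family == "species" &
  occurrence$taxonRank == "Phyllostomidae"

# Swap the two values back for those rows only.
tmp <- occurrence$family[swapped]
occurrence$family[swapped] <- occurrence$taxonRank[swapped]
occurrence$taxonRank[swapped] <- tmp

# Row numbers whose scientificName is not spelled as expected.
misspelled <- which(occurrence$scientificName != "Desmodus rotundus")

print(table(occurrence$family, occurrence$taxonRank))
print(misspelled)
```

After the swap, the cross-tabulation should show a single family/taxonRank combination, and `misspelled` lists the rows to correct by hand.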

pvandevuurst commented 2 years ago

For the basis of record it looks like some things have been collated. The values should be HUMAN_OBSERVATION, PRESERVED_SPECIMEN, or LIVE_SPECIMEN. That specific record is unknown, so it should be left blank or listed as UNKNOWN. Does that make sense?
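One possible explanation for the odd values, offered as a guess and not a confirmed cause: a case-insensitive substring replacement of single-letter codes (S, then O) applied to a value like MachineObservation produces exactly this kind of string. The sketch below reproduces that and shows an exact-match alternative; the single-letter codes, the `lookup` vector and the sample values are assumptions.

```r
# How a substring replacement can garble a longer value (hypothesis only).
step1 <- gsub("s", "PRESERVED_SPECIMEN", "MachineObservation", ignore.case = TRUE)
garbled <- gsub("o", "HUMAN_OBSERVATION", step1, ignore.case = TRUE)
print(garbled)

# Exact-match alternative: only whole values that equal a code are recoded.
allowed <- c("HUMAN_OBSERVATION", "PRESERVED_SPECIMEN", "LIVE_SPECIMEN")
lookup  <- c(O = "HUMAN_OBSERVATION", S = "PRESERVED_SPECIMEN")  # hypothetical codes

raw <- c("O", "S", "MachineObservation", "PRESERVED_SPECIMEN", NA)
recoded <- ifelse(raw %in% names(lookup), unname(lookup[raw]), raw)

# Anything outside the allowed values (including missing) is listed as UNKNOWN.
recoded[!(recoded %in% allowed)] <- "UNKNOWN"
print(recoded)
```

Matching on the whole value means a record like MachineObservation is never partially rewritten; it simply falls through to UNKNOWN, as suggested above.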