trias-project / unified-checklist

🇧🇪 Global Register of Introduced and Invasive Species - Belgium
https://trias-project.github.io/unified-checklist/
MIT License

Is the column `issues` still a list? #71

Closed (peterdesmet closed 1 year ago)

peterdesmet commented 1 year ago

Tidylog reports no changes for this step:

https://github.com/trias-project/unified-checklist/blob/9d309ecb1bb0de3d87f79c43e7da4eeb640640ee/src/1_get_taxa.Rmd#L118-L125

Maybe this step can be removed?
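One way to confirm this before removing anything is to check the column type directly. The sketch below uses made-up toy data and a hypothetical helper, `flatten_issues()`. It is not the code from `1_get_taxa.Rmd`, only an illustration of what such a guard could look like:

```r
# Toy stand-ins for the real `taxa` data; all values are made up.
taxa_list <- data.frame(taxonKey = 1:3)
taxa_list$issues <- list(c("bbmn", "scina"), character(0), "rankinv")

taxa_chr <- data.frame(
  taxonKey = 1:3,
  issues = c("bbmn,scina", "", "rankinv"),
  stringsAsFactors = FALSE
)

# A list column needs flattening; a character column does not.
is.list(taxa_list$issues)
is.list(taxa_chr$issues)

# Hypothetical helper: collapse each list element to a comma-separated
# string, and return character input unchanged.
flatten_issues <- function(x) {
  if (is.list(x)) {
    vapply(x, paste, character(1), collapse = ",")
  } else {
    x
  }
}

flat_from_list <- flatten_issues(taxa_list$issues)
flat_from_chr <- flatten_issues(taxa_chr$issues)
```

If `is.list()` returns `FALSE` on the real column, a conversion like this is a no-op, which would match what tidylog reports.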

PietrH commented 1 year ago

On my end, `issues` is already a character column, not a list:

> glimpse(taxa)
# Rows: 11,640
# Columns: 7
# $ taxonKey       <int> 141264581, 141264583, 141264585, 141264587, …
# $ scientificName <chr> "Nymphaea x marliacea Marliac", "Aucuba japo…
# $ taxonID        <chr> "alien-plants-belgium:taxon:f33d6d2dca86b53e…
# $ datasetKey     <chr> "9ff7d317-609b-4c08-bd86-3bc404b77c42", "9ff…
# $ nameType       <chr> "SCIENTIFIC", "SCIENTIFIC", "SCIENTIFIC", "S…
# $ issues         <chr> "rankinv", "", "", "", "", "", "", "", "", "…
# $ nubKey         <int> 7802465, 3033077, 5914287, 3172622, 2889934,…
> count(taxa, issues, sort = TRUE)
# count: now 17 rows and 2 columns, ungrouped
# # A tibble: 17 × 2
#    issues                      n
#    <chr>                   <int>
#  1 "scina"                  4971
#  2 ""                       4490
#  3 "bbmn,scina"              939
#  4 "disinv,scina"            754
#  5 "bbmn"                    265
#  6 "rankinv"                 132
#  7 "bbmn,rankinv"             19
#  8 "pp"                       17
#  9 "pnuidinv,scina"           15
# 10 "bbmn,pp"                  12
# 11 "bbmn,rankinv,scina"       10
# 12 "scina,vernnameinv"         5
# 13 "disinv,pnuidinv,scina"     4
# 14 "pnuidinv"                  4
# 15 "bbmn,pnuidinv,scina"       1
# 16 "desinv"                    1
# 17 "disinv"                    1

Should empty strings be NA?
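If we decide they should, a minimal sketch of the conversion could look like this. The vector is made-up example data, and `dplyr::na_if(issues, "")` would be the equivalent inside a `mutate()`:

```r
# Made-up example values mimicking the `issues` column.
issues <- c("rankinv", "", "bbmn,scina", "")

# Replace empty strings with NA, leaving all other values untouched.
issues_na <- issues
issues_na[issues_na == ""] <- NA_character_

# Number of values that became NA.
n_missing <- sum(is.na(issues_na))
```

This only illustrates the mechanics. Whether `NA` is preferable to `""` depends on how `issues` is used downstream.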

peterdesmet commented 1 year ago

OK, then we can remove that conversion step.