This PR solves #43 and includes regional distributions in the unified checklist.
Most of the changes are in `5_unify_information.Rmd`, with minor changes in `6_dwc_mapping.Rmd`.
Here is the new workflow, with added steps in bold and modified steps in italic:

1. Parse `temporal` (eventDate) information.
2. Filter distributions: this was already done in \@ref(filter-on-distribution).
3. **Map `locality` and `locationId` to regional or national level (see table in #43).**
4. **Add a Belgian distribution from regional distributions within a checklist if not present.**
5. *Choose a single distribution within a checklist for each location (partly changed by adding `locality` and `locationId` to `group_by()`).*
6. *Choose a single distribution across checklists (partly changed by adding `locality` and `locationId` to `group_by()`).*
7. Save to CSV.
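
For reviewers, the step that adds a Belgian distribution could be sketched roughly as below. This is a minimal illustration and not the code in `5_unify_information.Rmd`: the input table, the column names `taxonKey` and `checklistKey`, the regional `locationId` values, and the rule of taking the earliest `startYear` and latest `endYear` across regions are all assumptions made for the example.

```r
library(dplyr)

# Hypothetical input: one row per taxon, checklist and location.
# Regional locationId values are illustrative, not taken from the table in #43.
distribution <- tibble(
  taxonKey     = c(1, 1, 2, 2, 3),
  checklistKey = c("a", "a", "a", "a", "b"),
  locality     = c(
    "Flemish Region", "Walloon Region", "Belgium",
    "Flemish Region", "Brussels-Capital Region"
  ),
  locationId   = c(
    "ISO_3166-2:BE-VLG", "ISO_3166-2:BE-WAL", "ISO_3166-2:BE",
    "ISO_3166-2:BE-VLG", "ISO_3166-2:BE-BRU"
  ),
  startYear    = c(1990L, 2001L, 1950L, 1980L, 2010L),
  endYear      = c(2015L, 2018L, 2018L, 2018L, 2017L)
)

# For each taxon within a checklist that has no national record,
# derive one from its regional records (assumed rule: widest year range).
national_from_regions <- distribution %>%
  group_by(taxonKey, checklistKey) %>%
  filter(!any(locationId == "ISO_3166-2:BE")) %>%
  summarize(
    locality = "Belgium",
    locationId = "ISO_3166-2:BE",
    startYear = min(startYear),
    endYear = max(endYear),
    .groups = "drop"
  )

# Append the derived national records to the original ones.
distribution_with_national <- distribution %>%
  bind_rows(national_from_regions) %>%
  arrange(taxonKey, checklistKey, locationId)
```

In this toy input, taxon 2 already has a Belgian record and is left untouched, while taxa 1 and 3 each gain one derived national record.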
In the DwC mapping, only minor changes were applied:

- `distribution %<>% mutate(dwc_locationID = locationId)` instead of `distribution %<>% mutate(dwc_locationID = "ISO_3166-2:BE")`
- `distribution %<>% mutate(dwc_locality = locality)` instead of `distribution %<>% mutate(dwc_locality = "Belgium")`
To avoid a massive number of warnings when converting `Inf`/`-Inf` to integer, I split the `mutate` call that calculates `startYear` and `endYear` within checklists into two steps. The change has no influence on the results, but it improves the code and its speed, as no warnings have to be returned.
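
To illustrate the idea, here is one possible way to split such a conversion into two steps. It is a sketch under assumptions, not the exact code of this PR: the table `years` and its values are hypothetical, and `Inf`/`-Inf` are assumed to mark groups without any usable year. Calling `as.integer()` directly on `Inf` returns `NA` with a coercion warning, so the non-finite values are replaced by `NA` first.

```r
library(dplyr)

# Hypothetical per-checklist year bounds, still stored as doubles.
# Inf / -Inf are assumed to mean "no year available".
years <- tibble(
  taxonKey  = c(1, 2),
  startYear = c(1990, Inf),
  endYear   = c(2018, -Inf)
)

# Step 1: replace non-finite values with NA while the columns are doubles.
years <- years %>%
  mutate(
    startYear = if_else(is.finite(startYear), startYear, NA_real_),
    endYear = if_else(is.finite(endYear), endYear, NA_real_)
  )

# Step 2: convert to integer; no Inf is left, so no coercion warning is raised.
years <- years %>%
  mutate(
    startYear = as.integer(startYear),
    endYear = as.integer(endYear)
  )
```

The resulting columns are the same as with a direct conversion (`NA` where no year was available), only without the warnings.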
As the last commit, I applied `styler::style_file()` to the two Rmd files. I advise using it on all other mapping steps as well.