Closed see24 closed 5 months ago
It seems like the `zerofill = TRUE` option is not working well in general; it is not filtering out anything for me. I think this is because there are columns in my data (like `scientific_name`) that are not removed in the `select` statement before `unique`, so all rows are kept and then added back. It might work better to use `dplyr::distinct` on the columns that should uniquely identify a visit.
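
To illustrate what I mean, here is a minimal sketch with made-up data. The column names (`location`, `recording_date_time`, and so on) are assumptions for the example and may not match what the package actually uses internally:

```r
library(dplyr)

# Hypothetical detections: two species recorded on the same visit at location A
detections <- tibble(
  location = c("A", "A", "B"),
  recording_date_time = c("2022-06-01 05:00", "2022-06-01 05:00", "2022-06-02 05:00"),
  species_code = c("OVEN", "TEWA", "OVEN"),
  scientific_name = c("Seiurus aurocapilla", "Leiothlypis peregrina", "Seiurus aurocapilla")
)

# Dropping only species_code leaves scientific_name behind, so the two rows
# for the same visit still differ and unique() keeps both of them
visits_unique <- detections %>%
  select(-species_code) %>%
  unique()

# distinct() on the visit-identifying columns ignores any leftover species columns
visits_distinct <- detections %>%
  distinct(location, recording_date_time)

nrow(visits_unique)
nrow(visits_distinct)
```

With `distinct` on the visit columns, extra species-level columns in the input no longer affect which rows count as a visit.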

The data wrangling vignette recommends first doing an inner join between your data and the species table downloaded with `wt_get_species`, and then running `wt_tidy_species` to remove species that are not of interest. In fact this is required, because if you don't have a `species_class` column in the data passed to `wt_tidy_species` you get an error.

But then when you run `wt_tidy_species` it runs `wt_get_species` internally and re-downloads the same table. This isn't a big deal because it doesn't take long, but it seems unnecessary.

It looks like the error above is just because if
`zerofill = TRUE` there is a line to remove the `species_class` column, which gives an error if the column is not present. If that select statement were changed to select the desired columns, or to use `select(-any_of(c("species_code", "species_class", ...)))`, then there would be no need to download the species table and join it before running `wt_tidy_species`, which perhaps is the intended process. If that's the case, you could update the vignette, since the join to the species table is not really necessary.