Acanthiza / envClean

Clean biological data from large unstructured dataset(s)
https://acanthiza.github.io/envClean/

`filter_taxa` loses taxa when `taxa_col` is taxa #14

Closed Acanthiza closed 2 months ago

Acanthiza commented 8 months ago

    bio_taxa <- df %>%
      dplyr::distinct(original_name) %>%
      dplyr::left_join(taxonomy$lutaxa) %>%
      dplyr::left_join(taxonomy$taxonomy) %>%
      dplyr::filter(!is.na(taxa)) %>%
      dplyr::filter(rank <= required_rank) %>%
      dplyr::inner_join(df)

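At this step, if `taxa_col` is `taxa`, any taxa whose original name differs from the matched name are lost, presumably because the final `inner_join(df)` then also joins on the shared `taxa` column. Perhaps rename `taxa_col` in `df` to `original_name`?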
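A minimal sketch of the suspected behaviour, using made-up data. The table shapes, the `site` column, and the idea that `original_name` is created as a copy of `taxa_col` are all assumptions for illustration, not taken from the package source. The rank filter is left out, and the join keys that a natural join would pick are written explicitly.

```r
library(dplyr)

# Hypothetical lookup from original names to matched taxa
lutaxa <- tibble::tibble(
  original_name = c("Acacia pycnantha",
                    "Eucalyptus camaldulensis ssp. camaldulensis"),
  taxa = c("Acacia pycnantha", "Eucalyptus camaldulensis")
)

# Hypothetical records where taxa_col is "taxa" (it holds the original names)
df_raw <- tibble::tibble(
  taxa = lutaxa$original_name,
  site = c("A", "B")
)

# Assumed problem case: original_name is a copy, so df keeps its own `taxa`
df <- df_raw %>% mutate(original_name = taxa)

lost <- df %>%
  distinct(original_name) %>%
  left_join(lutaxa, by = "original_name") %>%
  filter(!is.na(taxa)) %>%
  # a natural join would use both shared columns, including `taxa`
  inner_join(df, by = c("original_name", "taxa"))

# Suggested fix: rename taxa_col, so only original_name is shared
df_fixed <- df_raw %>% rename(original_name = taxa)

kept <- df_fixed %>%
  distinct(original_name) %>%
  left_join(lutaxa, by = "original_name") %>%
  filter(!is.na(taxa)) %>%
  inner_join(df_fixed, by = "original_name")

nrow(lost)  # 1: the Eucalyptus record is dropped
nrow(kept)  # 2: both records survive
```

In the first pipeline the Eucalyptus record is dropped because the matched `taxa` no longer equals the `taxa` value stored in `df`. After the rename, the only shared key is `original_name`, so every matched record is retained.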

Acanthiza commented 2 months ago

Closing, as `filter_taxa` is deprecated in favour of `bin_taxa`.