leppott / BioMonTools

Tools for biomonitoring and bioassessment; metric calculation for benthic macroinvertebrates, fish, and periphyton.
https://leppott.github.io/BioMonTools/
MIT License
13 stars, 5 forks

taxa_translate - cleaning and matching on all upper case options #111

Closed (leppott closed this 3 months ago)

leppott commented 4 months ago

Is your feature request related to a problem? Please describe.
Add options to:

  1. clean up input names (e.g., strip white space, including the non-breaking spaces that come from ITIS).
  2. match input and master table taxa on all upper case.

Describe the solution you'd like
Fix these small issues in code so the translation table doesn't need a long list of variations of the same name that differ only in minor ways.

Describe alternatives you've considered
Sometimes the user doesn't know what needs to be done. Do it for them.

Additional context
Add both as optional parameters, with both turned off by default.
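
A rough sketch of the two requested options, written in Python only to illustrate the idea; it is not the BioMonTools R implementation. The function `translate_names`, the dictionary-style `master` table, and the sample names are all hypothetical. The option names `trim_ws` and `match_caps` are borrowed from the names used later in this thread.

```python
import re

# Leading/trailing white space. The non-breaking space (U+00A0) is listed
# explicitly because it is the character called out in the request.
_EDGE_WS = re.compile(r"^[\s\u00a0]+|[\s\u00a0]+$")


def translate_names(names, master, trim_ws=False, match_caps=False):
    """Map each input name to its master-table name, or None if unmatched.

    Hypothetical helper: `master` maps a raw taxon name to a preferred name.
    Both options are off by default, as the request asks.
    """
    def key(name):
        if trim_ws:
            name = _EDGE_WS.sub("", name)
        if match_caps:
            name = name.upper()
        return name

    # Apply the same normalization to both sides of the match.
    lookup = {key(raw): preferred for raw, preferred in master.items()}
    return [lookup.get(key(name)) for name in names]


# Made-up sample data for illustration only.
master = {"Baetis": "Baetis", "Hydropsyche": "Hydropsyche"}
names = ["Baetis", "baetis", "\u00a0Hydropsyche "]

default_result = translate_names(names, master)
cleaned_result = translate_names(names, master, trim_ws=True, match_caps=True)
print(default_result)
print(cleaned_result)
```

With the defaults, only the exact match succeeds. With both options on, all three inputs resolve to a single master entry each, so the translation table does not need a row per variation.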

leppott commented 3 months ago

The "clean" parameter works for regular and non-standard white space. There is a test for it, and it passes. That is half of this request done.

leppott commented 3 months ago

Too many issues with non-ASCII characters. Remove "match_caps" parameter.

Rename "clean" to "trim_ws".
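
The thread does not say which non-ASCII issues came up with "match_caps", and R may behave differently from the Python used here. As a general illustration only, these are some ways upper-casing can make matching unreliable once names contain non-ASCII characters. The sample strings are assumptions chosen for the demonstration, not taken from any taxa table.

```python
import unicodedata

# Upper-casing can change a string's length: German sharp s becomes "SS".
sharp_s_upper = "\u00df".upper()

# Distinct lower-case letters can collapse to one upper-case letter:
# dotless i (U+0131) and ASCII "i" both upper-case to "I".
dotless_collides = "\u0131".upper() == "i".upper()

# The same visible text can be stored precomposed or as a base letter plus
# a combining accent; upper-casing alone does not reconcile the two forms.
composed = "\u00e9".upper()       # precomposed e-acute
decomposed = "e\u0301".upper()    # "e" followed by combining acute accent
equal_raw = composed == decomposed
equal_nfc = (unicodedata.normalize("NFC", composed)
             == unicodedata.normalize("NFC", decomposed))

print(sharp_s_upper, dotless_collides, equal_raw, equal_nfc)
```

Reliable case-insensitive matching on such input would need Unicode normalization on top of the case change, which is more than a simple upper-case option provides.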