ImperialCollegeLondon / safedata_validator

Python tools to validate and publish datasets using the safedata metadata format.
https://safedata-validator.readthedocs.io/
MIT License

Handling of incertae sedis output from genbank #174

Open jacobcook1995 opened 1 month ago

jacobcook1995 commented 1 month ago

This PR adds handling of output containing Incertae sedis in GenBank/DADA2 format. It is designed to fail if the first or last provided rank is Incertae sedis, because in that case the taxonomic information is genuinely missing rather than reflecting a real taxonomic uncertainty.

The implementation is a little rough; let me know if you have ideas for cleaning it up.

Fixes #172
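
For readers following the thread, the rule described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names, the `(rank, name)` tuple representation of a lineage, the substring match on "incertae sedis", and the choice to return interior uncertain ranks separately are all assumptions made for the example.

```python
def is_incertae_sedis(name: str) -> bool:
    """Return True if a taxon name marks an uncertain placement.

    Assumption: any name containing "incertae sedis" (case-insensitive,
    underscores treated as spaces) counts as uncertain.
    """
    return "incertae sedis" in name.lower().replace("_", " ")


def split_incertae_sedis(lineage):
    """Separate resolved ranks from interior incertae sedis ranks.

    `lineage` is a hypothetical list of (rank, name) tuples ordered from
    the highest provided rank to the lowest. A ValueError is raised when
    the first or last provided rank is incertae sedis, since the
    information is then missing rather than uncertain.
    """
    if not lineage:
        raise ValueError("No taxonomic ranks provided")

    # Reject lineages whose outermost entries carry no real information.
    for position, (rank, name) in (("first", lineage[0]), ("last", lineage[-1])):
        if is_incertae_sedis(name):
            raise ValueError(
                f"The {position} provided rank ({rank}) is incertae sedis"
            )

    # Interior uncertain ranks are set aside rather than treated as errors.
    resolved = [(rank, name) for rank, name in lineage if not is_incertae_sedis(name)]
    uncertain = [rank for rank, name in lineage if is_incertae_sedis(name)]
    return resolved, uncertain


# Illustrative lineage only; the taxon names are placeholders.
example = [
    ("kingdom", "Bacteria"),
    ("phylum", "Firmicutes"),
    ("class", "Incertae sedis"),
    ("order", "Clostridiales"),
]
resolved, uncertain = split_incertae_sedis(example)
```

Keeping the uncertain ranks as a separate list is just one option; a real implementation might instead record them as warnings or drop them silently.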

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 77.97%. Comparing base (9324649) to head (7223808). Report is 84 commits behind head on develop.

Additional details and impacted files

```diff
@@             Coverage Diff             @@
##           develop     #174      +/-   ##
===========================================
- Coverage    79.04%   77.97%   -1.08%
===========================================
  Files           13       13
  Lines         3741     3854     +113
===========================================
+ Hits          2957     3005      +48
- Misses         784      849      +65
```


jacobcook1995 commented 1 month ago

Making this a draft, as it is unlikely we will ever publish anything with an NCBITaxa sheet, so a small fix like this is unlikely to be useful. I will close this and delete the branch once we have settled on a replacement system.