genomicsITER / NanoCLUST

NanoCLUST is an analysis pipeline for UMAP-based classification of amplicon-based full-length 16S rRNA nanopore reads

MIT License

Hi,

Thank you for opening the issue, and sorry for the late response. We have recently seen several users run into different issues at that step. Your suggestion works well for cases with "nan" tax_ids. We don't have any data available to test the pipeline under that condition, but we believe that assigning the 'root' node to 'nan' tax_ids is a reasonable choice. In any case, we always recommend checking the .nanoclust.out file, which contains the original top BLAST assignments for each cluster, rather than relying only on the .csv files and plots generated later by the Python scripts, to better inspect pipeline results.
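As a rough illustration of the 'root' fallback idea (a minimal sketch, not the actual get_abundance.py code), the snippet below maps missing or "nan" tax_ids to NCBI taxonomy ID 1, which is the 'root' node. The names `normalize_taxid` and `raw_taxids` are hypothetical and only used for this example.

```python
import math

ROOT_TAXID = 1  # NCBI taxonomy ID of the 'root' node


def normalize_taxid(value):
    """Return an integer tax_id, falling back to root for missing/NaN values.

    Hypothetical helper for illustration; not part of NanoCLUST.
    """
    if value is None:
        return ROOT_TAXID
    if isinstance(value, float) and math.isnan(value):
        return ROOT_TAXID
    text = str(value).strip()
    if text == "" or text.lower() == "nan":
        return ROOT_TAXID
    try:
        # Accept values such as "562" or 1280.0 (floats often appear
        # when a column containing NaN is read from a CSV).
        return int(float(text))
    except (ValueError, OverflowError):
        # Anything that cannot be parsed as a number is treated as unassigned.
        return ROOT_TAXID


# Example input mixing valid tax_ids with missing values (made-up data).
raw_taxids = ["562", float("nan"), "nan", 1280.0, ""]
clean_taxids = [normalize_taxid(t) for t in raw_taxids]
print(clean_taxids)
```

Clusters that end up on the root node this way are effectively unclassified, so they are worth cross-checking against the top BLAST assignments in the .nanoclust.out file.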

We have added your get_abundance.py edit to the main branch along with some other changes in that file to avoid tax_ID-to-name issues. Thank you for your contribution and feel free to open an issue again if something is not working!

Taxid output "Nan" #22

Hi,