Currently, samples with very few reads cause issues in certain steps of the pipeline.
Examples:
Humann2: low-read-count samples cause an error when the biom file is created, and the pipeline exits.
Mash: samples with very low read counts have insufficient sequence length to generate N kmers, so the mash rule fails.
DM calculation (for Mash or Humann2): rarefaction problem; differences in sample depth obscure other trends.
So far I've handled this by rerunning the computation with a config file in which the low-read samples are commented out. It would be preferable to do this algorithmically in the script.
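
One possible way to do this, as a rough sketch only: count the reads in each sample's FASTQ file and split the samples into kept and dropped sets before the Humann2 and Mash rules run. The function names, the `sample_files` mapping (sample name to FASTQ path), and the `min_reads` threshold are all hypothetical and not part of the current pipeline; it also assumes standard 4-line FASTQ records.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file (plain or gzipped).

    Assumes standard 4-line FASTQ records (no wrapped sequence lines).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def split_by_read_count(sample_files, min_reads):
    """Partition samples into (kept, dropped) dicts of sample -> read count.

    sample_files: hypothetical mapping of sample name -> FASTQ path.
    min_reads: hypothetical threshold; samples below it are dropped.
    """
    kept, dropped = {}, {}
    for sample, path in sample_files.items():
        n_reads = count_fastq_reads(path)
        if n_reads >= min_reads:
            kept[sample] = n_reads
        else:
            dropped[sample] = n_reads
    return kept, dropped
```

The `kept` samples could then be passed to the downstream rules, and the `dropped` dict written to a log so it is clear which samples were excluded and why. A suitable threshold probably differs per step (biom creation, Mash sketching, rarefaction), so it would likely need to be a config option rather than a hard-coded value.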