Open alk224 opened 11 years ago
@adamrp wrote a script last year that did the trick (https://gist.github.com/adamrp/7591573), but it still has not been added to QIIME. I recently pointed someone on the forum who requested this feature to the script, so I think there continues to be interest. As written, the script requires packages that are not present in the new biom, and it only reads JSON tables, not HDF5.
Right now, when filtering OTU tables to remove artifactual sequences, you can filter based on a global minimum count or percentage. This is problematic for most datasets, which have different numbers of reads per sample: if you filter at -n 100, the filter is much more aggressive for a sample with just 800 reads than for one with 10,000. If I expect a certain error rate from an Illumina run, say 0.1%, and I wish to filter at this level, I need a feature that filters each sample based on this percentage (a threshold of 0.8 reads for the sample with 800 reads, and 10 reads for the sample with 10,000 reads).
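
To make the request concrete, here is a rough sketch of the per-sample logic using plain NumPy instead of biom, so it does not depend on any particular biom version or table format. It is not the gist's code and not an existing QIIME option. The function name `filter_per_sample`, the `min_fraction` parameter, and the choice to keep counts that are equal to the threshold are all my own assumptions for illustration.

```python
import numpy as np


def filter_per_sample(counts, min_fraction):
    """Zero out counts that fall below a per-sample threshold.

    counts: 2-D array-like, rows are OTUs and columns are samples.
    min_fraction: expected error rate as a fraction (0.001 for 0.1%).

    Hypothetical helper for illustration only; not part of QIIME or biom.
    """
    counts = np.asarray(counts, dtype=float)
    # Total reads per sample (one value per column).
    sample_totals = counts.sum(axis=0)
    # Each sample gets its own minimum count, e.g. 0.8 for 800 reads.
    thresholds = sample_totals * min_fraction
    # Broadcasting compares every column against its own threshold.
    # Assumption: a count exactly at the threshold is kept.
    filtered = np.where(counts >= thresholds, counts, 0.0)
    # Flag OTUs that still have at least one read in some sample.
    keep = filtered.sum(axis=1) > 0
    return filtered, keep


# Two samples: A has 800 reads in total, B has 10,000.
table = [
    [700, 9000],  # OTU1
    [99, 986],    # OTU2
    [1, 9],       # OTU3: kept in A (1 >= 0.8), zeroed in B (9 < 10)
    [0, 5],       # OTU4: zeroed in B (5 < 10), so empty everywhere
]
filtered, keep = filter_per_sample(table, 0.001)
```

With these made-up counts, OTU3 survives in the 800-read sample but is removed from the 10,000-read sample, and OTU4 ends up empty in both, so `keep` could be used to drop it from the table. A global `-n` cutoff cannot express this, because it applies the same minimum to every sample.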