Closed chassenr closed 7 years ago
Christiane, sorry for the delayed reply; I missed the email notification for this thread. There is currently no option to limit the analysis to unique sequences; however, pplacer does this automatically during phylogenetic placement (the most time-intensive part of the paprica pipeline). If you have too many reads to run paprica in a reasonable time on your available resources, you could try the paprica Amazon machine image. This lets you run paprica on a virtual machine that may have more cores than your physical computer. Let me know if you need more details.
Cheers, Jeff
Hi, I have a very large 16S dataset, and I was wondering whether there is an option to run PAPRICA on only the unique 16S sequences to speed up the initial alignment step. If I understand the logs correctly, after cmalign the program continues with unique sequences anyway. Of course, PAPRICA will already work if I give it only unique 16S sequences, but as far as I can see, there is no option to take the sequence counts (abundance) of a non-redundant dataset into account when calculating the metabolic profile.
Thanks!
Cheers, Christiane
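
For anyone who wants to try the dereplication idea from the question outside of paprica, here is a minimal sketch in plain Python. It is not a paprica feature: the function names (`read_fasta`, `dereplicate`, `write_unique`) and the `uniqN` read names are made up for illustration, and the `;size=N` header suffix follows the VSEARCH/USEARCH convention. Whether paprica does anything with that suffix is not established in this thread, so the counts would most likely have to be carried alongside the results by hand.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dereplicate(records):
    """Count how often each exact sequence occurs (case-insensitive)."""
    counts = Counter()
    for _, seq in records:
        counts[seq.upper()] += 1
    return counts


def write_unique(counts):
    """Return FASTA text with one record per unique sequence.

    Hypothetical naming scheme: >uniq<index>;size=<abundance>
    Records are ordered from most to least abundant.
    """
    out = []
    for i, (seq, n) in enumerate(counts.most_common()):
        out.append(f">uniq{i};size={n}")
        out.append(seq)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = ">a\nACGT\n>b\nacgt\n>c\nGGCC\n".splitlines()
    print(write_unique(dereplicate(read_fasta(example))), end="")
```

This only collapses exact duplicates; it does no clustering or error correction. The abundances stay in the `Counter` (and in the headers), so they remain available for weighting the per-sequence results afterwards.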