mothur / mothur

Welcome to the mothur project, initiated by Dr. Patrick Schloss and his software development team in the Department of Microbiology & Immunology at The University of Michigan. This project seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
www.mothur.org
GNU General Public License v3.0

sequence mismatch after running screen.seqs #621

Closed: cfortunato82 closed this issue 5 years ago

cfortunato82 commented 5 years ago

Hi,

I'm following the MiSeq SOP to analyze some V4 amplicon data (using the AWS Mothur AMI, version 1.40.4) and am running into a weird issue.

After combining my paired-end reads I have 3515599 sequences in my fasta file. I then run screen.seqs with 0 ambiguous bases and a maximum length of 275 bp. I get the following message:

"It took 14 secs to screen 3515599 sequences, removed 632017"

So that makes sense. But after running summary.seqs I realized there are only 86185 sequences in the stability.trim.contigs.good.fasta file, so I'm confused about what happened to the rest of the sequences. I would expect 2883582 sequences, given that only 632017 were removed. Not sure if I'm just missing something super obvious or there is an actual issue. Thanks for the help!

Caroline
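
For anyone who wants to double-check the numbers above outside of mothur, here is a minimal Python sketch. It is not part of mothur; it recomputes the expected count from the screen.seqs message and counts FASTA records by their `>` header lines. The helper names (`count_fasta_records`, `count_fasta_file`) are made up for illustration, and the small in-memory FASTA is only a stand-in for the real file.

```python
import io

def count_fasta_records(handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))

def count_fasta_file(path):
    """Hypothetical convenience wrapper: count records in a FASTA file on disk."""
    with open(path) as handle:
        return count_fasta_records(handle)

# Numbers reported in the post above.
total_sequences = 3515599
removed = 632017
expected = total_sequences - removed  # sequences that should survive screening

# Tiny in-memory stand-in for a FASTA file, so the sketch runs anywhere.
demo = io.StringIO(">seq1\nACGT\n>seq2\nTTGA\n")
observed_demo = count_fasta_records(demo)

print(f"expected after screening: {expected}")
print(f"records in demo FASTA: {observed_demo}")
```

To check the real output, one would call `count_fasta_file("stability.trim.contigs.good.fasta")` and compare the result with `expected`; a much smaller number suggests the file was only partially written.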

mothur-westcott commented 5 years ago

Could you have run out of disk space to write the new screened file? Are you running with multiple processors?
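
A quick way to check the disk-space possibility is to look at the free space on the volume holding the working directory. The following is a rough Python sketch using only the standard library, not anything mothur provides; the helper names (`free_gib`, `has_room`) and the `"."` path are assumptions for illustration.

```python
import shutil

def free_gib(path="."):
    """Return free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30

def has_room(path, needed_bytes):
    """Return True if at least `needed_bytes` are free on `path`'s filesystem."""
    return shutil.disk_usage(path).free >= needed_bytes

# Assumes the screened FASTA is written to the current working directory.
print(f"free space here: {free_gib('.'):.2f} GiB")
```

If the free space is close to zero, or smaller than the size of the input fasta file, a truncated output file is a plausible explanation for the missing sequences.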

cfortunato82 commented 5 years ago

Hi,

Yes, I'm running with 8 processors using the Mothur AMI on AWS. It could be that I ran out of my allowed storage space...