dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Failed @ Step 2 #431

Closed Pgjhmb closed 3 years ago

Pgjhmb commented 3 years ago

Hello, 4 of my 21 libraries failed at step 2 and I wanted to ask how to troubleshoot this. The errors are below: 3 of the 4 samples gave the first error, and the 4th gave the second. Any help would be much appreciated, I have been banging my head against the wall.

<ipyrad.Sample object 42-MaxAS_001> IPyradError( error in cutadapt -a AGATCGGAAGAGC --quality-base 33 -q 20 --minimum-length 35 --max-n 5 --trim-n --output(.....)

 b"This is cutadapt 3.1 with Python 3.8.5\nCommand line parameters: -a AGATCGGAAGAGC --quality-base 33 -q 20 --minimum-length 35 --max-n 5 --trim-n --output (....)\nProcessing reads on 1 core in single-end mode ...\ncutadapt: error: Error in FASTQ file at line 15392642: Premature end of file encountered. The incomplete final record was: '@M04880:108:000000000-J8RGJ:1:2108:14678:23488 1:N:0:GTTTCGGA+TCTTTCCC\\nGATCACACGAAAAACGCGCTGCTTAGGGCAGGGGTGCGACGGCACGTCTTAAGCGGACC'\n")

For another sample I received the error below. I will try re-uploading that file to make sure it is not corrupted.

<ipyrad.Sample object 196-MaxRS_001> IPyradError( error in cutadapt -a AGATCGGAAGAGC --quality-base 33 -q 20 --minimum-length 35 --max-n 5 --trim-n --output (...)

 b'This is cutadapt 3.1 with Python 3.8.5\nCommand line parameters: -a AGATCGGAAGAGC --quality-base 33 -q 20 --minimum-length 35 --max-n 5 --trim-n --output (....)\nProcessing reads on 1 core in single-end mode ...\ncutadapt: error: Error in FASTQ file at line 15093140: Length of sequence and qualities differ\n')

isaacovercast commented 3 years ago

Hello, both of these errors indicate that the input fastq files have been truncated or corrupted. In nearly all cases the cause is either a corrupted sample fq.gz file or insufficient disk space, and of the two, disk space is by far the more common. Can you verify that you have adequate disk space for the analysis? You should have at least 10 GB of free disk per GB of raw sample data, ideally much more.
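
If it helps to check both possibilities before rerunning step 2, here is a minimal sketch using only the Python standard library. The helper names `check_fastq` and `has_enough_disk` are hypothetical and are not part of ipyrad or cutadapt. The sketch assumes simple four-line FASTQ records, and the default `factor=10` just encodes the 10 GB per GB rule of thumb above.

```python
import gzip
import os
import shutil
import zlib


def check_fastq(path):
    """Scan a FASTQ file (plain or .gz) and return (n_records, problem).

    problem is None when every record is complete and well formed.
    Assumes four-line records; hypothetical helper, not an ipyrad API.
    """
    opener = gzip.open if path.endswith(".gz") else open
    n_records = 0
    try:
        with opener(path, "rt") as handle:
            while True:
                header = handle.readline()
                if not header:
                    return n_records, None  # clean end of file
                seq = handle.readline()
                plus = handle.readline()
                qual = handle.readline()
                line_no = 4 * n_records + 1  # line where this record starts
                if not qual:
                    return n_records, f"incomplete final record starting at line {line_no}"
                if not header.startswith("@") or not plus.startswith("+"):
                    return n_records, f"malformed record starting at line {line_no}"
                if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
                    return n_records, f"sequence and quality lengths differ in record starting at line {line_no}"
                n_records += 1
    except (EOFError, OSError, zlib.error) as err:
        # A truncated or corrupted gzip stream surfaces here.
        return n_records, f"unreadable after {n_records} records: {err}"


def has_enough_disk(raw_paths, workdir, factor=10):
    """Compare free space in workdir against factor x total raw data size.

    Returns (ok, raw_bytes, free_bytes).
    """
    raw_bytes = sum(os.path.getsize(p) for p in raw_paths)
    free_bytes = shutil.disk_usage(workdir).free
    return free_bytes >= factor * raw_bytes, raw_bytes, free_bytes
```

Running `check_fastq` on each of the failed samples' raw files should show whether the input itself is truncated. If the raw files pass but step 2 still fails, that points toward disk space, which `has_enough_disk` can estimate for the project directory.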

isaacovercast commented 3 years ago

I assume this was the problem and that it has been fixed. Please reopen this issue if that isn't the case.