sklages closed 6 years ago
It is so simple ... I just needed to extend the allowed fastq extension settings: `ALLOWED_FASTQ = ['.fq','.fastq.gz','fq.gz']`.
So `Not valid Read FASTQ File, ill-formatted Index sequences` refers to the filename, not the content of the files!
You should really work on more meaningful error messages ... :unamused:
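For anyone hitting the same thing, here is a minimal sketch of what an extension check with a clearer error message might look like. This is not nudup's actual code: the function names `has_allowed_extension` and `check_fastq_name` are hypothetical, and only the `ALLOWED_FASTQ` values are taken from the comment above.

```python
# Values as quoted in the comment above; everything else here is illustrative.
ALLOWED_FASTQ = ['.fq', '.fastq.gz', 'fq.gz']


def has_allowed_extension(filename, allowed=ALLOWED_FASTQ):
    """Return True if the filename ends with one of the allowed suffixes."""
    # str.endswith accepts a tuple of suffixes and matches any of them.
    return filename.lower().endswith(tuple(allowed))


def check_fastq_name(filename, allowed=ALLOWED_FASTQ):
    """Raise an error that names the real cause: the file extension."""
    if not has_allowed_extension(filename, allowed):
        raise ValueError(
            "Unsupported FASTQ extension for %r; expected one of: %s"
            % (filename, ", ".join(allowed))
        )
    return filename
```

With a check like this, a rejected filename would be reported as an extension problem rather than as ill-formatted index sequences.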
Thank you for reporting this @sklages!
I have an RRBS dataset from two different runs with different run lengths, SR50 and SR75. Index+UMI = 6+6. Run data has been demultiplexed so that the UMI is located in R2 as a read (`'i6y6n'`). Data from both runs have been merged: both R1 files, both R2 files.
The merged R1 file has been mapped to mm9 using `bsmap` and should be deduplicated with `nugentechnologies-nudup-468c62e/nudup.py`. The error I get is:
The actual error results from erroneously looking in the header for the UMI .. not very helpful error messages :-(
So there is probably something I missed .. it works fine with unmerged data. Any idea where to start looking for the problem?