jsh58 / NGmerge

Merging paired-end reads and removing adapters
MIT License

Reads are good but throws error: "Sequence/quality scores do not match" #12

Closed. pjaborges closed this issue 4 years ago

pjaborges commented 4 years ago

I tried NGmerge on Linux, but it does not run with my simulated datasets.

The header patterns are the following:

@gi|110798562|ref|NC_008261.1|-100.101.325660/1
AAGTTCATCATAGTTATTTTGAATAAAATTTAATCTATCAAGTATCATCTATTATCACTCCGTATACAGATTTTCATATTTTACAATTATAGCACACTAC
+
>G9GFGCFGGGG#G#8#G)E##G6GGGBGGGGCGGGGGEFGG8GG:CGGF9,G9EGGGFGGGGGG6GGGGFGGGGCGGGGFGGGGGGGGGGGGGGFGGCG
@gi|110798562|ref|NC_008261.1|-100.101.325660/2
TAGTAGTGGGCTCTCTTTGTAAAATATAAACATCCGTATACGGAGTGATAATAGATTATACTTGATAGATTAAATTTTATTGAAAATAAATATGATGAAC
+
C2C*G*5)(*@4(G##:GGGF4G3,*D*#G#G(G#G*E05GGGGGG+.E+*5DGFG*4G8G1G+G+*GG87CGGCFGEG0FGCGFGGG+GGGGGGGGGGF

My version seems to expect a " " as the delimiter to create a single key. Thus, I was getting the error: ..... ": not matched in input files". I added a " " before the "/" and it solved the issue. I noticed afterwards that a new parameter (-t) was added to handle these situations.
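To illustrate the delimiter behaviour described above, here is a minimal Python sketch. It is not NGmerge's code: the helper `read_key` is hypothetical and simply assumes that the pairing key is the header text up to the first occurrence of the delimiter.

```python
def read_key(header: str, delimiter: str = " ") -> str:
    """Return the part of a FASTQ header before the first delimiter.

    Hypothetical helper: assumes the pairing key is everything up to the
    first occurrence of `delimiter` (the whole header if it is absent).
    """
    return header.split(delimiter, 1)[0]


r1 = "@gi|110798562|ref|NC_008261.1|-100.101.325660/1"
r2 = "@gi|110798562|ref|NC_008261.1|-100.101.325660/2"

# With a space delimiter the /1 and /2 suffixes stay in the key, so the keys differ.
print(read_key(r1) == read_key(r2))            # False
# With "/" as the delimiter the suffix is dropped and the keys agree.
print(read_key(r1, "/") == read_key(r2, "/"))  # True
```

Under that assumption, headers ending in /1 and /2 can only pair up when the delimiter is "/", which matches the role of the -t parameter mentioned above.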

Afterwards, another error appeared: "Sequence/quality scores do not match". This is thrown because of "ERRQUAL". The reads do not have any issue, and I have been able to run the datasets with many other tools (BBMerge, USEARCH, FLASH, PEAR, etc.).

I am sharing a small dataset, in case you want to investigate what could be the problem:

reads_NC_008261.1.100.101.10_R1.fq.gz

reads_NC_008261.1.100.101.10_R2.fq.gz

Thanks

jsh58 commented 4 years ago

Thanks for the question, and for including the dataset.

Unfortunately, I cannot reproduce the error. Here is the command and verbose output:

$ ./NGmerge \
  -1 reads_NC_008261.1.100.101.10_R1.fq.gz \
  -2 reads_NC_008261.1.100.101.10_R2.fq.gz \
  -o merged \
  -t '/' \
  -v
Processing files: reads_NC_008261.1.100.101.10_R1.fq.gz,reads_NC_008261.1.100.101.10_R2.fq.gz
  Fragments (pairs of reads) analyzed: 162830
  Successfully stitched: 126562

As you can see, using -t '/' solves the header issue. I wonder if, when you were dealing with the header issue, you inserted ' ' before every /, including those in the quality score strings (where / is a valid character), thus making those strings longer than their sequences.
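The suspected failure mode can be sketched in Python. This is only an illustration under stated assumptions: the record is made up, the function names are hypothetical, and the input is assumed to be plain four-line FASTQ records (no line-wrapped sequences). It compares a blanket replacement with one restricted to header lines, then reports records whose sequence and quality lengths differ.

```python
def blanket_replace(lines):
    """Insert a space before every '/' on every line (the risky approach)."""
    return [line.replace("/", " /") for line in lines]


def header_only_replace(lines):
    """Insert a space before '/' only on header lines (lines 1, 5, 9, ...)."""
    return [
        line.replace("/", " /") if i % 4 == 0 else line
        for i, line in enumerate(lines)
    ]


def mismatched_records(lines):
    """Return headers of records whose sequence and quality lengths differ."""
    return [
        lines[i]
        for i in range(0, len(lines) - 3, 4)
        if len(lines[i + 1]) != len(lines[i + 3])
    ]


# Made-up record; '/' is a valid quality character (ASCII 47).
record = ["@read1/1", "ACGTACGT", "+", "GG/GGG/G"]

print(mismatched_records(blanket_replace(record)))      # ['@read1 /1']
print(mismatched_records(header_only_replace(record)))  # []
```

Restricting the edit to header lines leaves the quality strings untouched, although with -t '/' no edit to the files should be needed at all.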

pjaborges commented 4 years ago

Yes, you are correct. I introduced spaces in the quality score strings when doing the replace.

It worked perfectly! Thanks

Kaderi15 commented 3 years ago

Dear Sir, I got the error "Error! @ERR3385738.1.1: not matched in input files" when running NGmerge to merge sequences. Please help me. I have attached my sample sequence files.

Command:

$ ./NGmerge -1 CDTC160004_R1.fastq.gz -2 CDTC160004_R2.fastq.gz -t ' ' -o merged -v

CDTC160075_S95_L001_R1_trim.fastq.gz

CDTC160075_S95_L001_R2_trim.fastq.gz

jsh58 commented 3 years ago

Please open a new issue. This is unrelated to #12.