STAR mapping EXITING because of FATAL ERROR in reads input: short read sequence line: 0 (Everything have been tried)

kaermadan commented 4 years ago

I got the error message "EXITING because of FATAL ERROR in reads input: short read sequence line: 0 Read Name=@HWI-D00289:135:C4U3VACXX:3:2316:6629:26242 Read Sequence==== DEF_readNameLengthMax=50000 DEF_readSeqLengthMax=650

Jan 06 15:02:18 ...... FATAL ERROR, exiting"

I use STAR to map paired-end RNA-seq reads to the reference genome. One sample out of 110 has this problem; all the others mapped fine. I have looked through similar questions and tried the proposed solutions (such as https://github.com/alexdobin/STAR/issues/493), but still cannot solve this problem.

STAR works OK when I map the R1 and R2 files individually;
I tried mapping with both trimmed and untrimmed files;
I extracted the reported read "@HWI-D00289:135:C4U3VACXX:3:2316:6629:26242" and its format looks fine;
I sorted the R1 and R2 FASTQ files by name, since one comment under a similar question said this problem could occur because the reads are not properly paired;
I also changed these parameters: "--outFilterScoreMinOverLread 0.1 --outFilterMatchNminOverLread 0.1 --outFilterMismatchNmax 2".

But none of these approaches worked in paired-end mode. Can anybody provide any hints on this? Thanks!
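For anyone hitting the same message: "short read sequence line: 0" together with the empty "Read Sequence===" reads as if STAR found a zero-length sequence line for that read, so it may be worth scanning each FASTQ file for malformed records rather than inspecting only the one reported read. Below is a minimal sketch of such a scan, assuming plain 4-line FASTQ records (optionally gzip-compressed). The function names open_fastq and find_bad_records are hypothetical helpers written for this post, not part of STAR.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def find_bad_records(path, limit=10):
    """Return up to `limit` tuples (record_index, read_name, problem).

    Assumes strict 4-line records: header, sequence, '+' separator, quality.
    """
    bad = []
    index = 0
    with open_fastq(path) as handle:
        while True:
            # Read the next four lines as one record.
            record = [line.rstrip("\r\n") for line in islice(handle, 4)]
            if not record:
                break  # clean end of file
            index += 1
            if len(record) < 4:
                # Ignore a trailing blank line, flag anything else.
                if any(record):
                    bad.append((index, record[0], "truncated record"))
                break
            name, seq, plus, qual = record
            if not name.startswith("@"):
                problem = "header does not start with @"
            elif not seq:
                problem = "empty sequence line"
            elif not plus.startswith("+"):
                problem = "separator does not start with +"
            elif len(seq) != len(qual):
                problem = "sequence/quality length mismatch"
            else:
                continue  # record looks fine
            bad.append((index, name, problem))
            if len(bad) >= limit:
                break
    return bad
```

Running find_bad_records on the R1 file and then on the R2 file (the paths are whatever your sample uses) lists the first few suspicious records with their positions. Once one record has lost or gained a line, every later record is shifted, so the first reported position is usually the one to inspect.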

kaermadan commented 4 years ago

Just to update: I remapped this sample today and no error came up, but the uniquely mapped rate is only 2%. I am not sure why it works today; all the runs I tried yesterday reported the above error.

When I loosen those parameters to "--outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outFilterMismatchNmax 2" instead of using the defaults, the uniquely mapped rate rises to ~47%, which is still low compared with the other samples under default parameters (60-90%). This modification may come at the cost of mapping quality, but it's better than nothing.

I'll keep this thread in case anyone encounters a similar problem.

alexdobin commented 4 years ago

Hi @kaermadan

what was the mapping rate when you mapped R1 and R2 individually, with default parameters? If it was high, i.e. comparable to the other samples, that would indicate a problem with the pairing of the mates.

Cheers Alex

kaermadan commented 4 years ago

Hi @alexdobin, thank you for replying. Yes, the mapping rate was high when mapping R1 and R2 individually with default parameters, ~94%. In contrast, the uniquely mapped rates for the other samples in paired-end mode are 60-93% (actually, I found that about one third of the 110 samples reached a mapping rate over 90%, so I think the individual R1 and R2 mapping rates for this sample are not unexpected). As I said in the original post, I have sorted the R1 and R2 FASTQ files by name, and this didn't solve the problem.

alexdobin commented 4 years ago

Hi @kaermadan

if you mapped untrimmed files (i.e. as they come from the Illumina pipeline), and the PE mapping rate is 2%, and the read1/2 individual mapping rates are >90%, then it's most likely a formatting problem. I would carefully check that there were no file swaps somewhere between the Illumina output and mapping, e.g. your read1 file is the same as your read2 file. Then I would re-run the Illumina pipeline to extract FASTQ from the BCL files, if possible.

If nothing helps, it's possible that there was some problem with Illumina sequencing and/or pipeline that caused the desynchronization of read1/read2 files. I have looked into one such case before, and there was nothing that could be done to salvage that data, unfortunately.
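The two checks above (a file swap or duplicate, and read1/read2 desynchronization) can be sketched in a few lines of Python. This is only an illustration, assuming 4-line FASTQ records and read names that match between mates once the comment field and any trailing /1 or /2 are removed. The helper names read_id, first_desync and looks_like_same_file are hypothetical and not part of STAR.

```python
import gzip
from itertools import islice, zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_id(header):
    """Strip '@', the comment after whitespace, and a trailing /1 or /2."""
    parts = header[1:].split()
    token = parts[0] if parts else ""
    if token.endswith(("/1", "/2")):
        token = token[:-2]
    return token


def names(path):
    """Yield the read ID of every record (every 4th line, starting at line 1)."""
    with open_fastq(path) as handle:
        for header in islice(handle, 0, None, 4):
            yield read_id(header.rstrip("\r\n"))


def first_desync(r1_path, r2_path):
    """Return (record_index, r1_id, r2_id) at the first mismatch, else None.

    A missing mate (files of different length) shows up as None on one side.
    """
    pairs = zip_longest(names(r1_path), names(r2_path))
    for index, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return index, a, b
    return None


def sequences(path, n):
    """Return the first n sequence lines (line 2 of each record)."""
    with open_fastq(path) as handle:
        return [s.rstrip("\r\n") for s in islice(islice(handle, 1, None, 4), n)]


def looks_like_same_file(r1_path, r2_path, n=1000):
    """True if the first n sequences are identical, hinting at a copied file."""
    a, b = sequences(r1_path, n), sequences(r2_path, n)
    return bool(a) and a == b
```

If first_desync returns a record index, the files are out of step from that point on. Note that sorting each file by name independently only restores pairing when both files contain exactly the same set of reads; if one mate is missing on either side, the sorted files still drift apart.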

Cheers Alex

kaermadan commented 4 years ago

Thank you very much for the advice, I'll check the format of the files to see if this helps! @alexdobin

alexdobin / STAR

STAR mapping EXITING because of FATAL ERROR in reads input: short read sequence line: 0 (Everything have been tried) #803