Xinglab / espresso

Thread 1 terminated abnormally for ESPRESSO_C.pl #34

MSQ-123 closed 1 year ago

MSQ-123 commented 1 year ago

This is my command (I have three samples; -X 1 selects the second sample):

    perl espresso_v_1_3_2/src/ESPRESSO_C.pl -I test_ChAR -F mm10.fa -X 1 -T 2

Previous step:

    perl ./espresso_v_1_3_2/src/ESPRESSO_S.pl -L samples.tsv -F mm10.fa -A gencode.vM22.annotation.gtf -O test_ChAR/

Thread 1 terminated abnormally: substr outside of string at espresso_v_1_3_2/src/ESPRESSO_C.pl line 774.
Worker 2 begins to scan sam.list3ab.
Terminating worker 2.
Signal TERM received in thread 2, but no signal handler set. at espresso_v_1_3_2/src/ESPRESSO_C.pl line 2033.
Perl exited with active threads:
    1 running and unjoined
    0 finished and unjoined
    0 running and detached

It still doesn't work on my M1 Mac. How can I solve this? Thank you! @EricKutschera

vetmohit89 commented 1 year ago

Hello Eric,

I am also having a similar issue:

Thread 1 terminated abnormally: substr outside of string at ESPRESSO_C.pl line 774.
Signal TERM received in thread 5, but no signal handler set. at ESPRESSO_C.pl line 2033.
Perl exited with active threads:
    4 running and unjoined
    0 finished and unjoined
    0 running and detached

EricKutschera commented 1 year ago

That looks like the same error as https://github.com/Xinglab/espresso/issues/17

That would happen if there is a read in the input alignment file that doesn't have the full sequence. You could try filtering your alignment files with samtools view -F 0x900 to remove secondary and supplementary alignments. You can also try running the code from this branch which should output the specific read causing the error: https://github.com/Xinglab/espresso/compare/kutscherae-read-id-error-print
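To make the effect of the 0x900 mask concrete, here is a minimal Python sketch of the same filter applied to SAM text lines. It is only an illustration of the flag test, not a replacement for samtools; the names `filter_primary` and `sam_record` and the sample records are made up for the example. Header lines are kept, which is an assumption about what you want in the output.

```python
NON_PRIMARY_MASK = 0x900  # 0x100 (secondary) | 0x800 (supplementary)


def filter_primary(sam_lines):
    """Yield header lines and alignment lines with neither 0x100 nor 0x800 set."""
    for line in sam_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@"):
            yield line  # keep the SAM header
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        # Parentheses matter: in Python, == binds tighter than &.
        if (flag & NON_PRIMARY_MASK) == 0:
            yield line


def sam_record(name, flag, cigar, seq):
    """Build a hypothetical 11-column SAM line for the demonstration."""
    return "\t".join(
        [name, str(flag), "chr1", "100", "60", cigar, "*", "0", "0", seq, "*"]
    )


if __name__ == "__main__":
    lines = [
        "@HD\tVN:1.6",
        sam_record("read1", 0, "10M", "ACGTACGTAC"),   # primary
        sam_record("read1", 256, "10M", "*"),          # secondary, no SEQ
        sam_record("read1", 2048, "4H6M", "GTACGT"),   # supplementary, hard clipped
    ]
    for kept in filter_primary(lines):
        print(kept)
```

Only the header and the FLAG 0 record pass the filter; the secondary and supplementary records are dropped.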

MSQ-123 commented 1 year ago

@EricKutschera I retried, and I found this:

error with read_ID: 95:586|b95991bf-4f27-4539-840c-fd1509efb7a8, read_seq4blast: short
Thread 1 terminated abnormally: substr outside of string at espresso_v_1_3_2/src/ESPRESSO_C.pl line 775.
Worker 2 begins to scan sam.list3ab.
Terminating worker 2.
Signal TERM received in thread 2, but no signal handler set. at espresso_v_1_3_2/src/ESPRESSO_C.pl line 2047.

How can I deal with this? It seems the read is too short.

EricKutschera commented 1 year ago

Can you find that read in your input file and post it? I expect that it is either a secondary or supplementary alignment and would be filtered out by samtools view -F 0x900. From the SAM specification (https://samtools.github.io/hts-specs/SAMv1.pdf):

For each read/contig in a SAM file, it is required that one and only one line associated with the read satisfies 'FLAG & 0x900 == 0'. This line is called the primary line of the read.

The non-primary alignments might not have the full read sequence, and if ESPRESSO cannot find the full sequence for a read, it reports the read as "short".
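One way to look up the offending read is to scan the SAM text for its name and report, for each record, whether it is primary, secondary, or supplementary and how long its SEQ field is. The sketch below assumes a plain-text SAM file (for a BAM, convert with samtools view first). The function name `describe_read` is hypothetical, and it is not clear from the error message whether the `95:586|` prefix is part of the read name stored in the file, so the name to search for is passed in as a parameter.

```python
def describe_read(sam_lines, read_name):
    """Return (flag, kind, seq_len) for every alignment line whose QNAME is read_name."""
    results = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and the header
        fields = line.rstrip("\n").split("\t")
        if fields[0] != read_name:
            continue
        flag = int(fields[1])
        if flag & 0x100:
            kind = "secondary"
        elif flag & 0x800:
            kind = "supplementary"
        else:
            kind = "primary"
        seq = fields[9]  # SEQ is the tenth SAM column; "*" means not stored
        seq_len = 0 if seq == "*" else len(seq)
        results.append((flag, kind, seq_len))
    return results


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python describe_read.py input.sam READ_NAME
    path, name = sys.argv[1], sys.argv[2]
    with open(path) as handle:
        for flag, kind, seq_len in describe_read(handle, name):
            print(f"FLAG={flag}\t{kind}\tSEQ length={seq_len}")
```

If the read shows up only as secondary or supplementary records, or its primary record has a SEQ length of 0, that would be consistent with the "short" message.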

EricKutschera commented 1 year ago

Here's a change so that ESPRESSO will filter out reads without a full sequence in the S step rather than give an error in the C step: https://github.com/Xinglab/espresso/pull/35
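As a rough illustration of what "a read without a full sequence" can mean at the SAM level, the Python sketch below flags records whose SEQ is `*` or whose CIGAR contains hard clipping (`H`), since hard-clipped bases are not present in SEQ. This is not the code from the pull request and may differ from the check ESPRESSO actually performs; `has_full_sequence` and `drop_incomplete` are hypothetical names.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def has_full_sequence(cigar, seq):
    """Return True if the record appears to carry the whole read sequence."""
    if seq == "*":
        return False  # sequence not stored in this record
    if cigar == "*":
        return True  # no CIGAR available, so nothing indicates clipping
    # Hard-clipped bases (H) are removed from SEQ, so the read is incomplete.
    return all(op != "H" for _, op in CIGAR_ELEMENT.findall(cigar))


def drop_incomplete(sam_lines):
    """Yield header lines and only those alignment lines with a full sequence."""
    for line in sam_lines:
        if not line.strip():
            continue
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if has_full_sequence(fields[5], fields[9]):  # CIGAR, SEQ columns
            yield line
```

Soft-clipped records (`S`) are kept, because soft-clipped bases remain in SEQ.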