TreesLab / NCLscan

We have developed a new pipeline, NCLscan, which is rather advantageous in the identification of both intragenic and intergenic "non-co-linear" (NCL) transcripts (fusion, trans-splicing, and circular RNA) from paired-end RNA-seq data.
MIT License

NCLscan finishes with an empty result file #21

Open HosseinAsghari opened 4 years ago

HosseinAsghari commented 4 years ago

Hi,

I am trying NCLscan on some paired-end data and the run finishes with the following message:

The result will be written to out.result See out.result.sam for the final alignment result.

However, "out.result" is an empty file while I expect to get many circRNAs.

At the end of the standard error output I got a bunch of "Format Error!" and a few "Format Error! Not paired-end!".
Here is the list of output files: output.list.txt

What could be causing this?

Thanks in advance, Hossein

chiangtw commented 4 years ago

Hi,

Please provide the following information:

thanks, tw

HosseinAsghari commented 4 years ago

Hi,

Log files: ncls.log.txt ncls.out.txt

Hossein

chiangtw commented 4 years ago

Hi,

If you don't mind, please send me the file "all.out.JS2.sam" in the NCLscan output directory.

And could you try the test dataset mentioned in the README? I was wondering if it could be run successfully.

The following is the link to the test dataset:

Thanks, tw

HosseinAsghari commented 4 years ago

Hi again and thanks for the follow up.

Here is the file you requested.

all.out.JS2.sam.zip

And yes, I have tested the sample dataset. It runs without any problems and produces results.

Hossein

chiangtw commented 4 years ago

Hi,

This issue is most likely due to a behavior change in novoalign V4.

The option "--pechimera", which enables supplementary alignments when one read of a pair is chimeric, now defaults to "on".

This change breaks the NCLscan pipeline.

The following records were found in the file "all.out.JS2.sam":

simulate:8826   99  simulate:8847.0 89  0   63S38M  =   275 287 TTCGGCCTCTACCCAAAGTGAAAAGACTGCTGTCAGATAGCACTTGCCTTCCCCATATTATTCAGAAAAATGTTAAGGATCTCATGGATTCAAATGGAATA   55555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555   PG:Z:novoalign  AS:i:0  UQ:i:0  NM:i:0  MD:Z:38 PQ:i:8  SM:i:0  AM:i:0  ZS:Z:R  NH:i:10 HI:i:1  IH:i:1  Z3:i:126    SA:Z:simulate:8847.6,931,+,63M38S,3,0;
simulate:8826   147 simulate:8847.0 275 0   101M    =   89  -287    GTGGCCCGTATGGATTTCATGAGATGCAAGAATTGTGGACCAAAGGAATGTTAAATGCAAAAACCAGATGCTGGGCTCAAGGCATGGATGGATGGCGACCA   55555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555   PG:Z:novoalign  AS:i:0  UQ:i:0  NM:i:0  MD:Z:101    PQ:i:8  SM:i:0  AM:i:0  ZS:Z:R  NH:i:10 HI:i:1  IH:i:1  Z3:i:375
simulate:8826   2145    simulate:8847.6 931 3   63M38S  simulate:8847.0 275 0   TTCGGCCTCTACCCAAAGTGAAAAGACTGCTGTCAGATAGCACTTGCCTTCCCCATATTATTCAGAAAAATGTTAAGGATCTCATGGATTCAAATGGAATA   55555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555   PG:Z:novoalign  AS:i:0  UQ:i:0  NM:i:0  MD:Z:63 ZS:Z:R  NH:i:10 HI:i:1  IH:i:1  Z3:i:993    SA:Z:simulate:8847.0,89,+,63S38M,0,0;
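
To see why the third row stands out, the FLAG column (second field) of each record can be decoded. The snippet below is only an illustrative sketch, not part of NCLscan; the helper name `decode_flag` is made up here, and the bit values are the ones defined in the SAM specification.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The three FLAG values seen in the records above.
for flag in (99, 147, 2145):
    print(flag, decode_flag(flag))
```

Flags 99 and 147 describe the usual first and second read of a pair, while 2145 additionally has the supplementary bit (0x800) set, so it is a third row for the same read name.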

The parser in the NCLscan pipeline expects the alignments to be ordered in pairs, i.e. exactly two consecutive rows per read pair. The additional supplementary row (FLAG 2145 above) breaks that assumption.
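
As a rough illustration of the problem, the sketch below drops supplementary rows (FLAG bit 0x800) from SAM text so that only the two primary rows per pair remain. This is not the NCLscan fix and has not been tested against the pipeline; the function name `drop_supplementary` and the shortened example records are assumptions made up for the illustration.

```python
SUPPLEMENTARY = 0x800  # SAM FLAG bit marking a supplementary alignment


def drop_supplementary(sam_lines):
    """Yield header lines and all alignment rows that are not supplementary."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            # Skip blank or malformed lines that have no FLAG column.
            continue
        if int(fields[1]) & SUPPLEMENTARY:
            # Extra row for a chimeric read; it breaks the two-rows-per-pair order.
            continue
        yield line


# Shortened, made-up records that mimic the three rows shown above.
records = [
    "@HD\tVN:1.0\n",
    "r1\t99\tchrA\t89\t0\t63S38M\t=\t275\t287\t*\t*\n",
    "r1\t147\tchrA\t275\t0\t101M\t=\t89\t-287\t*\t*\n",
    "r1\t2145\tchrB\t931\t3\t63M38S\tchrA\t275\t0\t*\t*\n",
]
kept = list(drop_supplementary(records))
print(len(kept))  # header plus the two primary rows
```

Whether the rest of the pipeline then behaves as it does with novoalign V3 output is a separate question that this sketch does not answer.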

--

For now, you could just use novoalign V3 to avoid this problem.

And we should fix this in the future.

Thanks! tw