Where are Total number of all support reads, Total number of junc-reads, Total number of span-reads in .result file

TreesLab / NCLscan

We have developed a new pipeline, NCLscan, which is rather advantageous in the identification of both intragenic and intergenic "non-co-linear" (NCL) transcripts (fusion, trans-splicing, and circular RNA) from paired-end RNA-seq data.

MIT License


Where are Total number of all support reads, Total number of junc-reads, Total number of span-reads in .result file #5

Closed NitinMandloi closed 5 years ago

NitinMandloi commented 8 years ago

I am using NCLscan for my RNA-seq data and I am able to run the full pipeline. I got my SAM file and the tab-delimited result file, but I cannot find the "Total number of all support reads", "Total number of junc-reads", and "Total number of span-reads" columns in the .result file.

Also, I would like to import the SAM file into IGV to validate the results. Can I load the result.sam file in IGV?

Kindly help me regarding the same.

Thank you

Regards

chiangtw commented 8 years ago

Please give me more information, such as the version of NCLscan you are using. Alternatively, you can try "NCL_Scan5.py" from the latest version of NCLscan: put it in your project directory, and then run

    > ./NCL_Scan5.py

and you can see if it works.

As for the file "result.sam": it contains the alignments of the reads mapped to the 'junction pseudo-reference', not to the reference genome, so it may not be possible to import "result.sam" into IGV directly.

Thanks.

NitinMandloi commented 8 years ago

I am using NCLscan_v1.3.

Thank you

chiangtw commented 8 years ago

OK, please try the new version of NCLscan! The "total number of support reads" has been provided since NCLscan v1.4.

Thanks! :-)

NitinMandloi commented 8 years ago

OK, so should I copy all the files from v1.5 into v1.3 and then rerun the whole analysis?

Thank you

chiangtw commented 8 years ago

You may just copy two new scripts "Add_read_count.py" and "utils.py" into your project directory, and run the following command:

    > ./Add_read_count.py -pj [Project_name] -tmp [Your_original_result_file] -sam [The_result_sam_file] -o [Output_file_name]

The last three columns of the output are the total numbers of support, junc, and span reads.

Thanks.

P.S. The argument "[Your_original_result_file]" described above actually needs the ".result.tmp" file as input. Sorry for the confusion!

NitinMandloi commented 8 years ago

After running the full pipeline, I got fusions where the number of junction reads is 1 and the total number of span reads is 0. Can I report these results for my sample? Only 1 read supports each of these fusions.

Example:

| Chr | Junction Coordinate | Strand | Gene | Chr | Junction Coordinate | Strand | Gene | Intragenic (1) or intergenic (0) case | Total number of all support reads | Total number of junc-reads | Total number of span-reads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| chr17 | 76415806 | + | PGS1 | chr17 | 76411033 | + | PGS1 | 1 | 1 | 1 | 0 |
| chr1 | 3431966 | - | MEGF6 | chr1 | 36931801 | - | CSF3R | 0 | 1 | 1 | 0 |
| chr14 | 88471566 | + | GPR65 | chr21 | 34918519 | + | SON | 0 | 1 | 1 | 0 |
| chr8 | 143572176 | + | BAI1 | chr18 | 72914295 | - | ZADH2 | 0 | 1 | 1 | 0 |
| chr9 | 139559237 | + | EGFL7 | chr17 | 42299830 | + | RP5-882C2.2 | 0 | 1 | 1 | 0 |
| chr17 | 29231084 | - | TEFM | chr17 | 42295664 | - | UBTF | 0 | 1 | 1 | 0 |
| chr2 | 71766369 | + | DYSF | chr15 | 45007621 | + | B2M | 0 | 1 | 1 | 0 |

Kindly help me with this.

Thank you

NitinMandloi commented 8 years ago

Kindly reply...

NitinMandloi commented 8 years ago

Apart from this, I would like to view my results in IGV. Any suggestions regarding that?

chiangtw commented 8 years ago

Hi,

NCLscan has a low false-positive rate, so it depends on your project. :-)

chiangtw commented 8 years ago

As for visualization, there is still no simple way to do it, but you can get the IDs of the support reads from ".result.sam" by using the event ID in ".result.tmp" as the key, and then use those IDs to retrieve the original reads.
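The lookup described above can be sketched roughly as follows. This is only a minimal illustration, not part of NCLscan: it assumes that the event ID from ".result.tmp" appears as the reference name (third column, RNAME) of each alignment in ".result.sam", which this thread does not confirm. The function name `read_ids_by_event` and the example records are hypothetical.

```python
from collections import defaultdict


def read_ids_by_event(sam_lines):
    """Group read names (SAM column 1) by reference name (SAM column 3).

    ASSUMPTION: the reference name in ".result.sam" is the event ID
    used in ".result.tmp". Check this against your own files.
    """
    ids = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[2] == "*":  # malformed or unmapped record
            continue
        ids[fields[2]].add(fields[0])
    return ids


# Made-up SAM records, for illustration only (not NCLscan output).
example_sam = [
    "@SQ\tSN:event_1\tLN:200",
    "read_A\t99\tevent_1\t10\t60\t4M\t=\t60\t54\tACGT\tIIII",
    "read_A\t147\tevent_1\t60\t60\t4M\t=\t10\t-54\tACGT\tIIII",
    "read_B\t0\tevent_1\t12\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read_C\t0\tevent_2\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
ids = read_ids_by_event(example_sam)
print(sorted(ids["event_1"]))
```

For a real run you would pass an open file handle instead of the list, for example `read_ids_by_event(open("S12.result.sam"))`, and then use the collected read names to pull the original reads from your FASTQ files.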

NitinMandloi commented 8 years ago

I am running the updated version, v1.6, and I am getting the following errors:

    Error: The requested bed file (/MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.preJS.bed) could not be opened. Exiting!
    Error: /MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.JS.ndx does not appear to be a valid novoindex. Code 9
    Error: /MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.JS.ndx does not appear to be a valid novoindex. Code 9
    ValueError: zero length field name in format
    ValueError: zero length field name in format
    Error: The requested bed file (/MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.PreJS2.bed) could not be opened. Exiting!
    Error: /MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.JS2.ndx does not appear to be a valid novoindex. Code 9
    ValueError: zero length field name in format
    ValueError: zero length field name in format
    ValueError: zero length field name in format
    ValueError: zero length field name in format
    IndexError: list index out of range
    IOError: [Errno 2] No such file or directory: '/MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.result.tmp2'
    IOError: [Errno 2] No such file or directory: '/MGMSTAR1/SHARED/ANALYSIS/TRAIL/NCLscan_v1.3/V1.6//S12.result.tmp3'

Kindly help me resolve this.

Thank you

Regards

chiangtw commented 8 years ago

Hi,

Please check "S12.preJS.bed" first. Was it generated successfully?

NitinMandloi commented 8 years ago

Hi...

Yes that file is empty!!!!

NitinMandloi commented 8 years ago

Hi, please find the attached nohup file.

Kindly help to resolve the bug.

Thank you

Regards nohup.txt

NitinMandloi commented 8 years ago

Please find the complete nohup file.

Please ignore previous file.

Thank you nohup.txt

chiangtw commented 8 years ago

It seems that the Python in your system is not 2.7.
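For context, "ValueError: zero length field name in format" is what Python 2.6 raises when a script uses auto-numbered fields such as `'{}'.format(x)`, which are only accepted from Python 2.7 onward. A quick way to check which interpreter is running is sketched below; the helper name `is_python_27` is hypothetical and not part of NCLscan.

```python
import sys


def is_python_27(version_info=None):
    """Return True if the given (or the running) interpreter is Python 2.7.x."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


# Report the interpreter that actually runs this script.
print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
if not is_python_27():
    print("This is not Python 2.7; per this thread, NCLscan v1.6 may fail here.")
```

Run it with the same `python` command that the pipeline scripts use, since a system may have several Python versions installed.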

NitinMandloi commented 8 years ago

Yes, this was the problem.

If the total number of supporting reads is 1, 2, or 4, will the event be a false positive?

What should the minimum cutoff be to accept a true fusion here?

chiangtw commented 8 years ago

In general, 2 is an acceptable cutoff for the total number of supporting reads, but it still depends on your project!
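A cutoff like this could be applied with a small filter such as the sketch below. It assumes the result file is tab-delimited and that the third-to-last column holds the total number of support reads, as described earlier in this thread for the output of "Add_read_count.py". The function name `filter_by_support` is hypothetical, and the last example row is made up to show an event that passes the cutoff.

```python
def filter_by_support(rows, min_support=2):
    """Keep rows whose third-to-last column (total support reads) >= min_support."""
    kept = []
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        try:
            support = int(fields[-3])
        except (ValueError, IndexError):
            continue  # header line or malformed row
        if support >= min_support:
            kept.append(fields)
    return kept


def tabbed(text):
    """Helper for this example only: turn a space-separated row into a tab-delimited one."""
    return "\t".join(text.split())


rows = [
    # Two rows taken from the example earlier in this thread (support = 1).
    tabbed("chr17 76415806 + PGS1 chr17 76411033 + PGS1 1 1 1 0"),
    tabbed("chr1 3431966 - MEGF6 chr1 36931801 - CSF3R 0 1 1 0"),
    # Made-up row, for illustration only (support = 3).
    tabbed("chrA 100 + GENE_A chrB 200 - GENE_B 0 3 2 1"),
]
kept = filter_by_support(rows, min_support=2)
print([fields[3] for fields in kept])
```

The default of 2 follows the suggestion above; pass a different `min_support` if your project calls for a stricter or looser threshold.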

flyman0302 commented 8 years ago

The file "2481_TR.JS.seq12" is empty. What is wrong? I ran NCLscan v1.6 under Linux (CentOS 7) on a 64-bit machine with 32 CPU cores and 512 GB RAM to analyze human cancer RNA-seq paired-end data. The Python version is 2.7.11. The attachment contains the full details of the result files. ncl_log.txt