dieterich-lab / JACUSA2

New version of JACUSA -> 2.0
GNU General Public License v3.0

Strandedness Issue -- opposite results for Reditools2 and JACUSA2 #78

Closed ekofman closed 1 month ago

ekofman commented 1 month ago

Hi,

I am running REDItools2 and JACUSA2 on the same .bam file. To confirm the strand-specificity of my data, I ran RSeQC's infer_experiment.py, which infers the strand-specificity of RNA-seq data, on my .bam file. It output:

>> infer_experiment.py -r reference.bed -i sample.bam

Reading reference gene model reference.bed ... Done
Loading SAM/BAM file ...  Total 200000 usable reads were sampled

This is SingleEnd Data
Fraction of reads failed to determine: 0.0471
Fraction of reads explained by "++,--": 0.0378
Fraction of reads explained by "+-,-+": 0.9152

Based on this, it seems that most of my reads are strand-specific and follow the second-strand convention (FR-SECONDSTRAND). Thus, I would expect to use the following:

For REDItools2: use -s 2 to specify second-strand specificity.
For JACUSA2: use -P FR-SECONDSTRAND to match the strand-specific setting for my data.

I am thus calling:

java -jar JACUSA_v2.0.4.jar call-1 -P FR-SECONDSTRAND -p 32 -q 30 -m 20 -r sample.edits.bed -f B sample.bam

However, I seem to be getting the exact opposite edits (reverse complement) of what I got using -s 2 in REDItools2. Have there been any other issues like this? Is there a chance the result is incorrect? It seems one of these two tools is flipping the strand incorrectly, unless there is another reason why they would produce opposite results?
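To make the "opposite edits" symptom concrete, here is a minimal sketch of what a strand flip does to a single-base substitution. The function name is hypothetical and nothing here comes from REDItools2 or JACUSA2; it only illustrates that the same site reported on the other strand shows complemented bases.

```python
# Hypothetical helper: how a substitution looks when reported on the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def flip_strand(ref: str, alt: str) -> tuple:
    """Return the (ref, alt) pair as it would appear on the opposite strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]

# An A-to-G edit called with the wrong strand setting shows up as T-to-C.
print(flip_strand("A", "G"))  # ('T', 'C')
```

If one tool reports mostly A>G and the other mostly T>C at the same positions, a mismatched library-type setting is a likely explanation.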

ekofman commented 1 month ago

sample_bam.zip

Here is a subset of reads from the BAM file I am running with ^

piechottam commented 1 month ago

Hi,

according to https://rseqc.sourceforge.net/#infer-experiment-py [...] +-,-+

read mapped to ‘+’ strand indicates parental gene on ‘-’ strand

read mapped to ‘-’ strand indicates parental gene on ‘+’ strand

[...] and the result: [...] Fraction of reads explained by "+-,-+": 0.9152 [...]

Your library type would be RF-FIRSTSTRAND. Use -P RF-FIRSTSTRAND in JACUSA2 and your problems should go away.
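The reasoning in this reply can be sketched as a small script that reads the single-end infer_experiment.py report and suggests a value for JACUSA2's -P option. This is an illustration only: the function names and the 0.8 threshold are assumptions, the "+-,-+" mapping is taken from the reply, and the "++,--" mapping to FR-SECONDSTRAND is assumed to be the mirror case. Paired-end reports use different pattern labels and are not handled.

```python
import re

# "+-,-+" -> RF-FIRSTSTRAND comes from the reply above.
# "++,--" -> FR-SECONDSTRAND is an assumed mirror case, not confirmed in this thread.
PATTERN_TO_LIBRARY = {
    "+-,-+": "RF-FIRSTSTRAND",
    "++,--": "FR-SECONDSTRAND",
}

def parse_fractions(report: str) -> dict:
    """Extract the 'Fraction of reads explained by ...' values from the report text."""
    fractions = {}
    for match in re.finditer(r'explained by "([^"]+)":\s*([0-9.]+)', report):
        fractions[match.group(1)] = float(match.group(2))
    return fractions

def suggest_library_type(report: str, threshold: float = 0.8):
    """Return a -P value if one pattern dominates, else None.

    The threshold is an arbitrary illustrative cutoff, not an RSeQC or JACUSA2 rule.
    """
    fractions = parse_fractions(report)
    for pattern, library in PATTERN_TO_LIBRARY.items():
        if fractions.get(pattern, 0.0) >= threshold:
            return library
    return None  # no dominant pattern: possibly unstranded or ambiguous

REPORT = '''This is SingleEnd Data
Fraction of reads failed to determine: 0.0471
Fraction of reads explained by "++,--": 0.0378
Fraction of reads explained by "+-,-+": 0.9152
'''

print(suggest_library_type(REPORT))  # RF-FIRSTSTRAND
```

Applied to the report posted in this issue, the dominant "+-,-+" fraction (0.9152) leads to RF-FIRSTSTRAND, matching the recommendation above.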

ekofman commented 1 month ago

Great, thank you for the clarification! Glad it was just user error.

(Sent by email in reply to Michael Piechotta's comment of Thu, Oct 31, 2024, 12:23 AM.)

piechottam commented 1 month ago

np.