IRFinder differential analysis

bmb13 commented 2 years ago

Dear Claudio

I have used IRFinder v1.3.0 in the past. Using the R script provided by the other Github wiki of IRFinder (https://github.com/williamritchie/IRFinder/wiki/Generalized-Linear-Model-with-Replicates) to do my differential analysis, I was able to obtain IRratio of the two groups I am trying to compare (that is not the IRratio provided during 'Quantify IR' step nor is it mean IRratio of the samples). And at the end, an IR.change value was generated for each retained intron. Here is a snippet of the code that I am referring to:

res.WT = results(dds, name = "ConditionWT.IRFinderIR")
# This tests if the number of IR reads are significantly different from normal spliced reads, in the WT samples.
# We might only be interested in the "log2FoldChange" column, instead of the significance.
# This is because "log2FoldChange" represents log2(number of intronic reads/number of normal spliced reads).
# So we the value of (intronic reads/normal spliced reads) by

WT.IR_vs_Splice=2^res.WT$log2FoldChange

# As IR ratio is calculated as (intronic reads/(intronic reads+normal spliced reads))
# We can easily convert the above value to IR ratio by

IRratio.WT = WT.IR_vs_Splice/(1+WT.IR_vs_Splice)

# Similarly, we can get IR ratio in the KO samples
res.KO = results(dds, name = "ConditionKO.IRFinderIR")
KO.IR_vs_Splice=2^res.KO$log2FoldChange
IRratio.KO = KO.IR_vs_Splice/(1+KO.IR_vs_Splice)

# Finally we can test the difference of (intronic reads/normal spliced reads) ratio between WT and KO
res.diff = results(dds, contrast=list("ConditionKO.IRFinderIR","ConditionWT.IRFinderIR"))

# We can plot the changes of IR ratio with p values
# In this example we defined significant IR changes as
# 1) IR changes no less than 10% (both direction) and 
# 2) with adjusted p values less than 0.05

IR.change = IRratio.KO - IRratio.WT

I am wondering if there is a parameter I can use in IRFinder Diff to obtain these values (IRratio.KO, IRratio.WT, IR.change, res.WT, res.KO) as well? Otherwise, may I know if you could consider adding these extra outputs?

Could I also check if DESeq2 results generated by 'IRFinder Diff' in IRFinder-S is equivalent to values in 'res.diff' generated by the R script? Ultimately, I would like to know the differences in results when doing differential analysis in IRFinder and IRFinder-S.

Thank you very much.

Anna

CloXD commented 2 years ago

Dear Anna, Thank you for your interesting point, I'll make those values available as soon as I have time to work on that. Meanwhile, you can use the DESeq2 constructor placed in the exact location as before ( bin/DESeq2Constructor.R ) and the same snippets. You can find the code R used to generate the DESeq2 results in https://github.com/RitchieLabIGH/IRFinder/blob/main/bin/util/deseq2.R For what concerns the differences between the two versions, in the first version of IRFinder all the introns were used in the constructor, even the ones in transcripts not expressed ( with the warning "LowCoverage" ), meanwhile in the current version the introns with low IRratio and the selected warnings ( -ir and -wl option of the Diff command, respectively) are excluded. The different total number of intron tested impacts the p-value adjustment. Cheers, Claudio

bmb13 commented 2 years ago

Dear Claudio

Thank you for getting back to me.

Regarding what you have said that introns with low IRratio and selected warnings being excluded, may I know more details?

As an example, if I have provided 10WT and 10KO 'IRFinder-IR-dir.txt' files for differential analysis and 3 out of 10 WT files for intron 'x' have low IRratio/warnings which lead to their exclusion, will the differential analysis for intron 'x' be between 7WT and 10KO instead 10WT vs 10KO?

Does the exclusion mean that the differential analysis of each intron will be between a different number of WT and KO (instead of 10WT vs 10KO) as different number of samples will be excluded for each intron?

Thank you!

Anna

CloXD commented 2 years ago

Dear Anna, sure: the IRratio filter works excluding introns in which none of the samples has a ratio higher than the user-definable threshold ( default 0.05 ). Therefore, your intron x is not filtered and all the samples are used in the DE analysis. This filter is thought to exclude those introns in which none of the samples shows any sign of intron retention.

For what concerns the warning filter, it removes the introns having no expression of the hosting gene in one or more samples or whose evidence of splice is too low ( this is the default behaviour, but you can disable it with the -wl 0 argument). In the help, you can find a summary of what I just described. Let me know if it's clear or if you need more information. Cheers Claudio

~$ IRFinder Diff -h 

IRFinder version: 2.0.0
Run Mode Diff: find differential intron retention events using DESeq2 or SUPPA2 algorithm.

Usage: IRFinder Diff -g:Group1 ./Group1/Sample*/IRFinder-IR-nondir.txt -g:Group2 ./Group2/Sample*/IRFinder-IR-nondir.txt [-g:Group3 ...]

  required:
    -g:GroupName : IRfinder's results ( IRFinder-IR-[non]dir.txt ) for each GroupName.

  optional:
    -ir Minimum IR ratio met in at least one sample to consider the intron. Default 0.05
    -wl Warning Level: if at least one sample has a warning with level equal or higher than the
                       one required, the intron is not considered:
         0: Keep all the introns 
         1: Discard the LowCover 
         2: Discard the LowCover and LowSplicing  ( Default ) 
         3: Discard the LowCover, LowSplicing and MinorIsoform
         4: Discard the LowCover, LowSplicing, MinorIsoform and NonUniformIntronCover
    -m 'deseq' if you want to use DESeq2, otherwise SUPPA2 method ( [empirical|suppa] or classical ). Default: deseq 
    -o The output folder. Defualt: ./diff/ 
    -v Verbose.

  DESeq2 options
    -c Use cooksCutoff
    -i Use independentFiltering
  SUPPA2 options
Additional arguments will be passed to SUPPA2 diffSplice. You can see their details using
  suppa.py diffSplice --help 

The arguments -p, -e and -i will automatically generated, don't give them as input.

bmb13 commented 2 years ago

Dear Claudio Thank you for answering my queries!

Anna

RitchieLabIGH / IRFinder

IRFinder differential analysis #16