dieterich-lab / JACUSA2

New version of JACUSA -> 2.0
GNU General Public License v3.0

chi square test for likelihood ratio? #54

Open fishercera opened 2 years ago

fishercera commented 2 years ago

Hi Dieterich Lab,

I love the tool. It's everything I wanted.

I'm trying to figure out how many degrees of freedom the model behind the log-likelihood ratio has, because I'd like to attach a p-value to the threshold I use for calling a site "edited." I've gone through your publication and I think it's 4, one parameter for each base. But I'm not sure, and I'm not good enough at statistics to understand the model well enough to check.

Thanks for any guidance you can provide.

piechottam commented 2 years ago

Hi,

check:

java -jar <path to JACUSA2 jar> call-2 -h

this gives:

[...]
-u <MODE>             Choose between different modes (Default: DirMult):
                       DirMult Compound Error (estimated error {0.01} + phred score)
                       |       Adjusts variant condition
                       | :epsilon       Fit achieved if |L1 - L2| < epsilon, where L1 and L2 correspond to old
                       |                and new likelihood respectively.
                       |                Default: 0.001
                       | :maxIterations Maximum number of iterations for Newton's method.
                       |                Default: 100
                       | :calcPvalue    Calculate a pvalue based on a chi^2 approximation of the likelihood
                       |                ratio
                       | :showAlpha     Show detailed info of Newton's method in output (not in VCF output).
[...]

Then adjust your JACUSA2 call by adding the option: [...] call-2 -u DirMult:calcPvalue [...]
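If you want to sanity-check such a p-value by hand, or translate a significance level back into a cutoff on the statistic, the chi^2 approximation can be sketched in a few lines of plain Python. This is only an illustration and not JACUSA2's own code: the function names `chi2_sf` and `lr_threshold` are made up for this example, the statistic is assumed to already be on the 2 * (logL_alt - logL_null) scale, and the degrees of freedom `df` are left as an input because this thread does not state which value JACUSA2 uses.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability P(X >= stat) for X ~ chi-square with integer df.

    ASSUMPTION: stat is 2 * (logL_alt - logL_null). If your score is a plain
    log-likelihood difference, double it before calling this function.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def lr_threshold(alpha: float, df: int, upper: float = 1e4) -> float:
    """Smallest statistic whose chi-square p-value is <= alpha (by bisection)."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid  # p-value still too large, move the cutoff up
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # df values here are placeholders, not a statement about JACUSA2's model.
    for df in (1, 2, 3, 4):
        print(df, chi2_sf(10.0, df), lr_threshold(0.05, df))
```

The closed-form sums avoid a SciPy dependency; if SciPy is available, `scipy.stats.chi2.sf(stat, df)` should give the same upper-tail probability.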

Notes:

fishercera commented 2 years ago

Thank you so much! And, yes, I totally understand why you'd prefer to work with the raw LRs. Looking at the distribution of LRs was a lot more informative than a p-value would be. I still need to be able to calculate a p-value for reviewers who are even less stats-savvy than I am (at least I know what a Dirichlet multinomial is)...