qqwang-berkeley / JUM

A tool for annotation-free differential analysis of tissue-specific pre-mRNA alternative splicing patterns
MIT License

filtering outputs from cutoff p-value=1 #42

Open avantika-insitro opened 2 years ago

avantika-insitro commented 2 years ago

I have run JUM twice on the same dataset, once with an adjusted p-value cutoff = 1 and once with an adjusted p-value cutoff = 0.05. I got the simplified and detailed results from both runs.

However, I find that I cannot reproduce the results from the padj=0.05 run simply by thresholding the padj=1 run.

For example, when I look at the AS_differential_JUM_output_intron_retention_adjusted_pvalue_1_final_simplified.txt output and threshold it by qvalue < .05, this returns 18 significant events that have q-value < .05. However, when I look at the AS_differential_JUM_output_intron_retention_adjusted_pvalue_0.05_final_simplified.txt output, it contains only 3 events, not 18.

Is this expected behavior?

qqwang-berkeley commented 1 year ago

Yes. You set the p-value cutoff to different values, and JUM filters its results according to that setting, so you will see different results. A cutoff of 1 basically gives you the full set of alternative splicing events in your sample (which may or may not change significantly between your condition and control).

avantika-insitro commented 1 year ago

Thank you. I understand that I can set the p-value threshold and filter the padj=1 outputs down to padj 0.05. What I do not understand is why the results of this filtering differ from those obtained by passing 0.05 as the p-value cutoff to JUM. Shouldn't the results obtained by passing padj 0.05 be identical to those obtained by setting padj 1 and subsequently filtering to 0.05?

qqwang-berkeley commented 1 year ago

If you first run JUM with padj_1, you should be able to retrieve exactly the same results by filtering the raw results with any new threshold as you would get by running JUM with padj set to the desired value. Be careful with the filtering step, as it is tricky: you are filtering by the usage of a sub-junction within an AS event, but you need to report the whole AS event. Could you show me exactly how you filter the JUM results? I can also send you a downstream script that I use to filter JUM results from padj 1.
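For readers following along, here is a minimal sketch of what "filter by sub-junction, report the AS event" could look like. It is not the downstream script mentioned above. It assumes a tab-delimited table with one row per sub-junction, and the column names `AS_event_ID` and `qvalue` are hypothetical placeholders; check the header of your own JUM output, and whether JUM uses a strict or inclusive comparison at the cutoff, before relying on it.

```python
import csv
import io

# Hypothetical column names: replace with the ones in your JUM output header.
EVENT_COL = "AS_event_ID"
QVALUE_COL = "qvalue"


def read_rows(handle):
    """Read a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def significant_events(rows, cutoff):
    """Return IDs of events with at least one sub-junction row below the cutoff."""
    events = set()
    for row in rows:
        try:
            q = float(row[QVALUE_COL])
        except (TypeError, ValueError):
            continue  # skip missing or non-numeric values such as "NA"
        if q < cutoff:  # assumption: strict comparison
            events.add(row[EVENT_COL])
    return events


def filter_by_event(rows, cutoff):
    """Keep every row of any event that has a significant sub-junction."""
    keep = significant_events(rows, cutoff)
    return [row for row in rows if row[EVENT_COL] in keep]


# Made-up toy table, for illustration only.
demo = (
    "AS_event_ID\tsub_junction\tqvalue\n"
    "event_1\tj1\t0.01\n"
    "event_1\tj2\t0.40\n"
    "event_2\tj1\t0.20\n"
    "event_2\tj2\tNA\n"
)
rows = read_rows(io.StringIO(demo))
kept = filter_by_event(rows, 0.05)
print(len(kept), sorted(significant_events(rows, 0.05)))
```

In this toy table, thresholding row by row would keep only the `j1` row of `event_1`, while the event-level filter keeps both of its rows. Comparing the set returned by `significant_events` for the padj 1 output against the event IDs in the padj 0.05 output is one way to see which events account for a discrepancy in counts.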