vastgroup / vast-tools

A toolset for profiling alternative splicing events in RNA-Seq data.
MIT License

Compare vs Diff #98

Closed by juliana-chu 1 year ago

juliana-chu commented 3 years ago

Hello, there!

I have some questions about the compare and diff commands and would like to clarify them. From the README, compare returns a table of AS events that are differentially spliced between two groups, based on their average inclusion levels across replicates. Is diff differential AS based on each replicate? With my samples, diff returns a longer table than compare, and it includes all the events from compare. Does that mean diff's table returns all events, with a calculated MV? Also, some events returned by compare have an MV at 0.95 of 0 in diff. Does that mean the event has different AS in the two groups, but there is an over 95% chance that the event will not happen in the future?

Please clarify that for me!

Thank you!

mirimia commented 3 years ago

Hello,

The output of compare contains only the events that pass the filters and the test (which compares PSI averages and the overlap between distributions). The output of diff (@UBrau please confirm) contains only the events that have passed the filtering criteria (minimum read coverage), and then you need to select your MV cut-off.

It is possible that an event picked by compare has an MV at 0.95 of 0. This means that, statistically, it is unlikely to be a changing event, probably because there are few reads and the differences aren't large.

The idea behind offering both compare and diff is to provide more flexibility in the analysis.

I hope it helps! Best, Manu

UBrau commented 3 years ago

Hi there,

diff returns differential splicing for all replicates together. It fits one beta distribution to all replicates of one condition and another to all replicates of the other condition. It then returns the estimated means for both conditions, as well as MV[abs(dPSI)]_at_0.95: the minimum PSI/PIR difference that is present with a likelihood of > 95%.
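To make the MV idea concrete, here is a rough Monte Carlo sketch in Python. It is not the vast-tools implementation. The function name, the beta parameters, and the choice to orient the difference by its mean sign and clip at zero are all assumptions made for illustration. It draws from two beta distributions and reports the difference that is exceeded by about 95% of the sampled pairs.

```python
import random


def min_diff_at_confidence(a1, b1, a2, b2, conf=0.95, n=20000, seed=0):
    """Hypothetical helper: difference between two beta-distributed
    inclusion levels that is reached by about `conf` of sampled pairs."""
    rng = random.Random(seed)
    # Sample the difference between the two conditions.
    diffs = [rng.betavariate(a1, b1) - rng.betavariate(a2, b2) for _ in range(n)]
    # Orient the differences so the average change is positive (assumption).
    if sum(diffs) < 0:
        diffs = [-d for d in diffs]
    diffs.sort()
    # Value at the (1 - conf) quantile: about `conf` of samples lie at or above it.
    cutoff = diffs[int((1 - conf) * n)]
    # If that quantile is negative, no positive difference is supported.
    return max(0.0, cutoff)


# Well-separated conditions give a clearly positive value.
separated = min_diff_at_confidence(90, 10, 10, 90)
# Overlapping conditions give 0, i.e. no supported difference.
overlapping = min_diff_at_confidence(5, 5, 5, 5)
print(separated, overlapping)
```

Under these assumptions, two broad, overlapping distributions yield 0 even if their sampled means differ slightly. This matches the case described earlier in the thread, where an event has an MV at 0.95 of 0.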

diff returns all events, so you should do your own filtering based on the quality columns of each replicate, and decide how many replicates you require to pass filtering (e.g., 2 out of 3). I would recommend using MV[abs(dPSI)] > 0 as evidence of differential splicing, and the difference in means as the actual PSI/PIR difference.
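The MV > 0 selection could be scripted along these lines. This is a minimal sketch that assumes the diff output is a tab-separated table with a header row. The column name is copied from this thread and the `EVENT` column is hypothetical, so check both against the header of your own file. The rows in `toy` are made-up values for illustration only.

```python
import csv
import io

# Column name as written in this thread; verify it against your file's header.
MV_COL = "MV[abs(dPSI)]_at_0.95"


def select_differential(handle, mv_col=MV_COL, min_mv=0.0):
    """Hypothetical helper: keep rows whose MV value is above `min_mv`."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if float(row[mv_col]) > min_mv]


# Made-up toy table standing in for a diff output file.
toy = "EVENT\tMV[abs(dPSI)]_at_0.95\nev1\t0.12\nev2\t0\nev3\t0.03\n"
hits = select_differential(io.StringIO(toy))
print([row["EVENT"] for row in hits])
```

For a real file, pass an open file handle instead of the `io.StringIO` object. The per-replicate quality filtering is not covered here and would need to be applied separately.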

Ulrich