dieterich-lab / Baltica

Baltica: integrated differential junction usage
https://dieterich-lab.github.io/Baltica/
MIT License

Feedback to 1st presentation #5

Closed · boehmv closed this 4 years ago

boehmv commented 4 years ago

This is a crude list of comments/ideas/etc. that came to my mind while looking at the first Baltica presentation:

boehmv commented 4 years ago

Small addition after talking to Niels about it:

tbrittoborges commented 4 years ago

Hi Volker,

Thank you VERY much for the helpful feedback; below are my responses.

Page 1: Title sounds good to me!

We have to change it slightly to "Baltica: integrated differential junction usage and consequence analysis with an ensemble of methods" and the subtitle to "Baltica: integrated differential junction usage analysis".

Page 2: I like the idea of integrating a transcript scheme in the logo; maybe one can "morph" it inside the Baltic Sea map? On the other hand, it should not be overcrowded in the end

Can you print the current logo, draw your idea over it, and send me a photo? Our idea is to add the cap and poly(A) tail and maybe a few other exons

Page 6: JunctionSeq is a mixed DEU/DTU method, or am I mistaken? Does it depend on the "analysis.type"?

Yes, JunctionSeq can do DEU, but, to be fair, I find it hard to distinguish JunctionSeq's DEU from DEXSeq's DEU, and if I wanted to do DEU I would probably use DEXSeq. We can discuss this further

Page 6: We/you probably need to justify why those three were chosen in the end for the majority voting. In the case of MAJIQ vs. LeafCutter it is pretty easy, since it is "exon-centric" vs. "intron-centric", but for JunctionSeq I find it hard to define what it really does (in easy words; is "negative binomial model" telling enough?). Does something come to your mind?

I have the following criteria (from the manuscript): to select the supported methods, we used the following criteria: the method should take read alignments in BAM format as input; detect AS as changes at the splice junction (SJ) level, not the transcript level; provide a test statistic that compares a control group to a treatment group; output effect-size estimates, such as the delta PSI; and be able to detect unannotated SJs.

Page 8: At some point you had Whippet on that list as well; how was your experience with that tool?

I briefly tested it and was ready to add it to Baltica, but there is a major difference: it does its own pseudo-alignment, so it is harder to compare with other tools that use alignments as input. I saw there may be a way to bypass the alignment step, but I didn't test it

Page 10: "Validate the success potential bias in RNA extraction, lib prep or sequencing"
    Honestly, I don't know what that means; what is a "success potential bias", if you don't mind me asking :)

I will rewrite this slide; thanks for the feedback

This part is a bit problematic because there are no fixed rules for RNA-seq QC. The QC we usually do depends on many factors, such as the library prep, the RNA provenance, and the experience of the analyst

Page 10: "QC as diagnose"
    I find this an important/helpful point; maybe it is worth providing the user with some kind of QC HTML file that gives an easy overview of the parameters you listed

Page 10: What does junction saturation mean? Would the quantification of junctions not be "open-ended", meaning the more, the better? Also, what would the number of reads mapped to introns "tell us"?

SJ saturation comes from [RSeQC](http://dldcc-web.brc.bcm.edu/lilab/liguow/CGI/rseqc/_build/html/#junction-saturation-py). The idea is that by comparing the % of reads mapping to annotated SJs against the % aligned to novel ones, you can understand the dataset's limits for finding new splicing events.

If one group has an increased or decreased % of reads mapping to introns, that may indicate intron retention or accumulation of pre-mRNA, both of which are worth a follow-up

There are probably other uses for QC, but these are two I've experienced.
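
To make the saturation idea concrete, here is a minimal sketch (not Baltica or RSeQC code; per-read junction calls are assumed to be extracted already, e.g. from STAR's SJ.out.tab) that subsamples junction-supporting reads and counts distinct annotated vs. novel junctions at each depth:

```python
import random

def junction_saturation(read_junctions, annotated,
                        fractions=(0.2, 0.4, 0.6, 0.8, 1.0), seed=0):
    """read_junctions: one (chrom, start, end) tuple per junction-supporting read.
    annotated: set of junctions known from the reference annotation.
    Returns, per subsampling fraction, how many distinct annotated and novel
    junctions are observed."""
    rng = random.Random(seed)
    shuffled = list(read_junctions)
    rng.shuffle(shuffled)
    curve = []
    for f in fractions:
        seen = set(shuffled[: int(len(shuffled) * f)])
        curve.append((f, len(seen & annotated), len(seen - annotated)))
    return curve

# Toy input: three junctions, two of them annotated, one novel and rare.
reads = ([("chr1", 100, 200)] * 50 + [("chr1", 300, 400)] * 20
         + [("chr1", 500, 650)] * 2)
annotated = {("chr1", 100, 200), ("chr1", 300, 400)}
for f, n_annot, n_novel in junction_saturation(reads, annotated):
    print(f"{f:.0%} of reads: {n_annot} annotated, {n_novel} novel junctions")
```

If the novel curve is still climbing at 100% of the reads, deeper sequencing would likely reveal more events; a plateau suggests the dataset is saturated.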

Pages 12-14: Technical question: will Baltica go through all the steps of each program, e.g. generating the output plots? Would that be needed, or could one skip those steps, since you do the majority voting later on anyway?

In Baltica I implemented the minimal workflow to obtain each tool's text output. Right now only JunctionSeq outputs figures, but the intermediate files are still there if users want to plot with MAJIQ or LeafCutter

Page 17: Another technical question: how does Baltica deal with the different cut-offs the single tools use? For example, if I remember correctly, LeafCutter and MAJIQ calculate the dPSI differently.

There is no way around this: the assumptions are very different, and I would not compare the dPSI values even if I could. The problem is similar for LeafCutter's p-value vs. MAJIQ's posterior probability; those are not easy to compare. What is comparable are the ranks, and we could do some analysis there
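
As a sketch of that rank-based comparison (hypothetical toy scores, not Baltica's actual output format), Spearman correlation fits here because it only uses ranks, so LeafCutter p-values and MAJIQ posterior probabilities can be compared once they are oriented the same way:

```python
from scipy.stats import spearmanr

# Hypothetical scores for the same five junctions from two tools:
# LeafCutter p-values (smaller = more significant) and MAJIQ posterior
# probabilities (larger = more significant).
leafcutter_pval = [0.001, 0.20, 0.03, 0.74, 0.0005]
majiq_posterior = [0.98, 0.40, 0.90, 0.10, 0.99]

# Negate the p-values so that "larger = more significant" holds for both;
# Spearman only uses ranks, so any monotone transformation works.
rho, pval = spearmanr([-p for p in leafcutter_pval], majiq_posterior)
print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")
```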

Page 18: Why the 2 nt difference? If I remember correctly, the difference between the coordinates was normally rather 1 nt, or did I get this wrong?

1 nt comes from GFF to BED, which use 1-based and 0-based coordinate systems, respectively, and 1 nt from converting intron to exon coordinates. Handling this lets us easily add other tools to Baltica because, as far as I know, it covers all the problems
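
A minimal sketch of where the two offsets come from (generic helper names, not Baltica's actual functions):

```python
def gff_to_bed(start_gff, end_gff):
    """GFF is 1-based and closed; BED is 0-based and half-open,
    so the start shifts by 1 nt and the end stays put."""
    return start_gff - 1, end_gff

def intron_to_exonic_flanks(intron_start, intron_end):
    """From intron coordinates to the last base of the upstream exon and
    the first base of the downstream exon: 1 nt on each side."""
    return intron_start - 1, intron_end + 1

# Hypothetical junction reported as the intron chr1:1001-2000 (1-based, closed).
print(gff_to_bed(1001, 2000))               # (1000, 2000): same intron in BED
print(intron_to_exonic_flanks(1001, 2000))  # (1000, 2001): flanking exonic bases
```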

Page 20: Wow, that is pretty devastating for MAJIQ, am I right? In retrospect (and this probably makes everything super difficult), have you checked how something like Whippet or the others would perform in this scenario?

I am revisiting this plot right now; it is terrible for MAJIQ. I didn't test any other tool, but I am considering supporting Whippet, SUPPA2, and rMATS

Page 21: Many good points! Of course for us personally, the NMD-AS point is very interesting :)

I think I can implement this

Page 22: Again some honest feedback: I did not fully understand the sub-points you made concerning the "incompatibility among DJU results". What I get is that you are trying to figure out why the single tools differ so much, and I fully agree that this is an (academically) very interesting point. Do you have any indications so far what the reason might be?

The results are incompatible, not the methods: only 20% of the genes and 2% of the junctions are shared (slide 7). This is very low; for DGE methods we see something in the order of 60% or more. Why are the results different? I can't pinpoint one answer (a sketch of the overlap computation follows the list below), but:

  • Different strategies to compute the SJ count matrix; for example, LeafCutter discards read alignments with insertions or deletions
  • Different pre-processing procedures; for example, MAJIQ corrects for GC-content bias
  • Different test designs and statistical models

Is that convincing?
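
Here is that overlap computation sketched (hypothetical junction sets; Baltica's actual integration works on the tools' parsed outputs):

```python
def overlap_fraction(call_sets):
    """Fraction of all called junctions (the union) that every method
    agrees on (the intersection)."""
    union = set.union(*call_sets)
    return len(set.intersection(*call_sets)) / len(union) if union else 0.0

# Hypothetical significant junctions per method, keyed by coordinates.
junctionseq = {("chr1", 1000, 2000), ("chr1", 5000, 6000), ("chr2", 100, 900)}
majiq = {("chr1", 1000, 2000), ("chr2", 100, 900), ("chr3", 10, 500)}
leafcutter = {("chr1", 1000, 2000), ("chr4", 7, 70)}

shared = overlap_fraction([junctionseq, majiq, leafcutter])
print(f"{shared:.0%} of junctions are called by all three methods")
```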

    Page 23: You would do the de novo annotation completely without reference (which is probably the best for "de novo")? However, the transcripts then have some cryptic names; or would you still try to reconcile, in the end, which reference transcripts the ones you have match the best?

Is de novo without a reference the best de novo? I don't have an opinion about this, but I think having the reference helps with specificity and sensitivity, if the model transcriptome is well built (human, mouse); still, I would have to do some systematic analysis with the SIRVs to be sure

    Did you check how the SIRV benchmark (Page 20) gets "better" when using Baltica? For the left plot (recall vs. 1-specificity), LeafCutter already seems almost perfect, with only a small recall deficit (which might get better with Baltica?)

Christoph also suggested that; I will do it. I am very confident I won't make the recall better, but maybe the specificity
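
A minimal sketch of the majority-voting idea (a two-of-three rule is assumed here; the actual integration may weigh the methods differently) shows why specificity can improve while recall cannot:

```python
from collections import Counter
from itertools import chain

def majority_vote(call_sets, min_votes=2):
    """Keep junctions called significant by at least min_votes methods."""
    votes = Counter(chain.from_iterable(call_sets))
    return {junction for junction, n in votes.items() if n >= min_votes}

# Hypothetical calls: "fp" is a false positive unique to one tool, so the
# vote drops it (specificity improves); a true junction missed by every
# tool can never be recovered, so recall cannot improve.
calls = [{"j1", "j2", "fp"}, {"j1", "j2"}, {"j1", "j3"}]
print(majority_vote(calls))  # {'j1', 'j2'}
```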

Along the same line, wouldn't it be the "ultimate" goal to use the best combination of tools, or is this too simple since the SIRVs only give 69 transcript variants?

There is no guarantee that the best combination of tools would generalize to other datasets; the advantage of using the SIRVs over a simulated dataset is limited but, in my opinion, important. We can discuss this further if you want

Again, thank you very much for the comments; your feedback is invaluable!

boehmv commented 4 years ago

Hi Thiago,

sorry for the extremely slow response, and also sorry that I will not cover all of your questions, but I hope I can contribute step by step:

Page 2: I like the idea of integrating a transcript scheme in the logo; maybe one can "morph" it inside the Baltic Sea map? On the other hand, it should not be overcrowded in the end

Can you print the current logo, draw your idea over it, and send me a photo? Our idea is to add the cap and poly(A) tail and maybe a few other exons

Please find my - quick - draft here: Logo_draft.pdf

It was prepared in CorelDRAW, so I can export it in almost any format that suits you, but anyway it is rather an idea, so let me know what you think :)

Page 6: We/you probably need to justify why those three were chosen in the end for the majority voting. In the case of MAJIQ vs. LeafCutter it is pretty easy, since it is "exon-centric" vs. "intron-centric", but for JunctionSeq I find it hard to define what it really does (in easy words; is "negative binomial model" telling enough?). Does something come to your mind?

I have the following criteria (from the manuscript): to select the supported methods, we used the following criteria: the method should take read alignments in BAM format as input; detect AS as changes at the splice junction (SJ) level, not the transcript level; provide a test statistic that compares a control group to a treatment group; output effect-size estimates, such as the delta PSI; and be able to detect unannotated SJs.

The criteria seem reasonable! I don't know how many other tools (probably a lot) would fit them. One that I just wanted to bring to your attention (maybe you have seen it already) is PSI-Sigma from the Krainer lab: https://github.com/wososa/PSI-Sigma

I got it to run recently just out of curiosity, because they claimed to also look at NMD status, until I figured out that they just infer NMD status from the GTF annotation file ... pretty disappointing on that part, but other than that it seemed to do a good job. I believe you have neither the time nor the resources to also implement that tool in the Baltica workflow, but it might be worth keeping it on a list for the future?

Page 8: At some point you had Whippet on that list as well; how was your experience with that tool?

I briefly tested it and was ready to add it to Baltica, but there is a major difference: it does its own pseudo-alignment, so it is harder to compare with other tools that use alignments as input. I saw there may be a way to bypass the alignment step, but I didn't test it

Actually, I also finally got Whippet to run some time ago, and I was massively disappointed by how slow it was. I also agree with your point that it might require more testing before Whippet can be properly integrated.

Page 10: "QC as diagnose"
    I find this an important/helpful point; maybe it is worth providing the user with some kind of QC HTML file that gives an easy overview of the parameters you listed
Page 10: What does junction saturation mean? Would the quantification of junctions not be "open-ended", meaning the more, the better? Also, what would the number of reads mapped to introns "tell us"?

SJ saturation comes from [RSeQC](http://dldcc-web.brc.bcm.edu/lilab/liguow/CGI/rseqc/_build/html/#junction-saturation-py). The idea is that by comparing the % of reads mapping to annotated SJs against the % aligned to novel ones, you can understand the dataset's limits for finding new splicing events.

If one group has an increased or decreased % of reads mapping to introns, that may indicate intron retention or accumulation of pre-mRNA, both of which are worth a follow-up

There are probably other uses for QC, but these are two I've experienced.

Thanks for the extra info; I never considered that issue before!

Page 21: Many good points! Of course for us personally, the NMD-AS point is very interesting :)

I think I can implement this

This really is not high priority right now for the "public masses"; rather, if we continue looking into NMD, it might be worth the effort at some point.

Page 22: Again some honest feedback: I did not fully understand the sub-points you made concerning the "incompatibility among DJU results". What I get is that you are trying to figure out why the single tools differ so much, and I fully agree that this is an (academically) very interesting point. Do you have any indications so far what the reason might be?

The results are incompatible, not the methods: only 20% of the genes and 2% of the junctions are shared (slide 7). This is very low; for DGE methods we see something in the order of 60% or more.

Why are the results different? I can't pinpoint one answer, but:

  • Different strategies to compute the SJ count matrix; for example, LeafCutter discards read alignments with insertions or deletions

  • Different pre-processing procedures; for example, MAJIQ corrects for GC-content bias

  • Different test designs and statistical models

Is that convincing?

I see! Thanks for the clarification

Page 23: You would do the de novo annotation completely without reference (which is probably the best for "de novo")? However, the transcripts then have some cryptic names; or would you still try to reconcile, in the end, which reference transcripts the ones you have match the best?

Is de novo without a reference the best de novo? I don't have an opinion about this, but I think having the reference helps with specificity and sensitivity, if the model transcriptome is well built (human, mouse); still, I would have to do some systematic analysis with the SIRVs to be sure

Sorry if it seemed as if I knew that "reference-free" de novo is the best; honestly, I simply don't know! I also think having a reference will probably help; I just remember from my StringTie adventures that bringing the reference back in at some point was unfortunately not as straightforward as I had hoped.

Did you check how the SIRV benchmark (Page 20) gets "better" when using Baltica? For the left plot (recall vs. 1-specificity), LeafCutter already seems almost perfect, with only a small recall deficit (which might get better with Baltica?)

Christoph also suggested that; I will do it. I am very confident I won't make the recall better, but maybe the specificity

Along the same line, wouldn't it be the "ultimate" goal to use the best combination of tools, or is this too simple since the SIRVs only give 69 transcript variants?

There is no guarantee that the best combination of tools would generalize to other datasets; the advantage of using the SIRVs over a simulated dataset is limited but, in my opinion, important.

We can discuss this further if you want

As we discussed at some other point, it probably really depends on how Baltica performs before you/we can draw any conclusions. If you like, keep me updated on the progress :)

I will start trying to install Baltica this week; I will let you know which questions or problems arise.