Closed jfass closed 2 years ago
The only thing I'm seeing is the following warnings in stderr:
/home/ubuntu/Spladder/spladder-venv/lib/python3.5/site-packages/numpy/core/fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/home/ubuntu/Spladder/spladder-venv/lib/python3.5/site-packages/numpy/core/_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
... but, there are plenty of other events being generated. Just not in the gene I messed with.
Hi Joe,
Thanks for reporting. Would you mind sharing the test.bam
with me?
Best,
Andre
Hi Andre,
Here's a link to the data (two bams from samples in GEO, referenced in the SpliceDetector paper, so they're public, and the GTF modified by me to have an extra exon -- search for "Joe" in the gtf file). Let me know if you have any problems grabbing it; I think I set up the S3 bucket for full public access, but haven't tested that.
Many thanks for taking a look, ~Joe
Edit: heh ... forgot to make it public; also fixed the link
Hi Joe,
I finally was able to look at the data you sent. It is my understanding that you only altered the input gtf file. So the bam file(s) do not contain any evidence for the exon you added? In this case, SplAdder will not report an event, as by default only events are reported that have confirmation in at least one of the given RNA-Seq samples. If you would like to see the coordinates of all events that could be extracted from the graph, you can force SplAdder to output those as well (using --output-gff3
). This file indeed contains your fake event:
chr21 exon_skip gene 26976108 26978967 . - . ID=exon_skip.329;GeneName="ENSG00000154719.13_5"
chr21 exon_skip mRNA 26976108 26978967 . - . ID=exon_skip.329_iso1;Parent=exon_skip.329;GeneName="ENSG00000154719.13_5"
chr21 exon_skip exon 26976108 26976247 . - . Parent=exon_skip.329_iso1
chr21 exon_skip exon 26978761 26978967 . - . Parent=exon_skip.329_iso1
chr21 exon_skip mRNA 26976108 26978967 . - . ID=exon_skip.329_iso2;Parent=exon_skip.329;GeneName="ENSG00000154719.13_5"
chr21 exon_skip exon 26976108 26976247 . - . Parent=exon_skip.329_iso2
chr21 exon_skip exon 26977250 26977447 . - . Parent=exon_skip.329_iso2
chr21 exon_skip exon 26978761 26978967 . - . Parent=exon_skip.329_iso2
If you need it, I can add an option, to also go on and quantify all events (and not only the ones confirmed in at least on input sample).
Cheers, Andre
Hi Andre,
In that test case, since I added a fake exon, every read that splices across the two real flanking exons should be evidence for the exon skip, no?
~Joe
On Fri, Aug 9, 2019, 4:33 AM Andre Kahles notifications@github.com wrote:
Hi Joe, I finally was able to look at the data you sent. It is my understanding that you only altered the input gtf file. So the bam file(s) do not contain any evidence for the exon you added? In this case, SplAdder will not report an event, as by default only events are reported that have confirmation in at least one of the given RNA-Seq samples. If you would like to see the coordinates of all events that could be extracted from the graph, you can force SplAdder to output those as well (using --output-gff3). This file indeed contains your fake event:
chr21 exon_skip gene 26976108 26978967 . - . ID=exon_skip.329;GeneName="ENSG00000154719.13_5" chr21 exon_skip mRNA 26976108 26978967 . - . ID=exon_skip.329_iso1;Parent=exon_skip.329;GeneName="ENSG00000154719.13_5" chr21 exon_skip exon 26976108 26976247 . - . Parent=exon_skip.329_iso1 chr21 exon_skip exon 26978761 26978967 . - . Parent=exon_skip.329_iso1 chr21 exon_skip mRNA 26976108 26978967 . - . ID=exon_skip.329_iso2;Parent=exon_skip.329;GeneName="ENSG00000154719.13_5" chr21 exon_skip exon 26976108 26976247 . - . Parent=exon_skip.329_iso2 chr21 exon_skip exon 26977250 26977447 . - . Parent=exon_skip.329_iso2 chr21 exon_skip exon 26978761 26978967 . - . Parent=exon_skip.329_iso2
If you need it, I can add an option, to also go on and quantify all events (and not only the ones confirmed in at least on input sample).
Cheers, Andre
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/ratschlab/spladder/issues/89?email_source=notifications&email_token=AAEILKFGTY3FVBSRHULVLR3QDVI2NA5CNFSM4H2US372YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD36NHVI#issuecomment-519885781, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEILKCNPQI33WJYYJTHHN3QDVI2NANCNFSM4H2US37Q .
Sure, but an event will only be reported if both paths in the event happen in at least one of the input bam files. There is no evidence for the inclusion of the exon in the bam files and hence no event is reported.
Cheers, Andre
Ah, so I assumed that SplAdder was building from- and then comparing to the canonical (GTF/GFF) annotation. But I can extract those events from the graph with that --output-gff3 option? What would be the new option, then? Does --output-gff3 output the entire graph, and so I would need to parse out the non-canonical lines (which is easy if they all have the appropriate event name in column 2)? So, if you added a new option, it would omit all of the graph paths that exist in the input GTF/GFF?
So this solves this artificial test case. Will you have time to look at the more realistic case I emailed you about -- the EGFRvIII one?
Thanks Andre; I appreciate the time you've put into this, ~Joe
On Fri, Aug 9, 2019 at 7:42 AM Andre Kahles notifications@github.com wrote:
Sure, but an event will only be reported if both paths in the event happen in at least one of the input bam files. There is no evidence for the inclusion of the exon in the bam files and hence no event is reported.
Cheers, Andre
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/ratschlab/spladder/issues/89?email_source=notifications&email_token=AAEILKCCMANKOKBVKO6VXBLQDV635A5CNFSM4H2US372YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD36327Q#issuecomment-519945598, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEILKE2K2QU6P2BJOHNZCLQDV635ANCNFSM4H2US37Q .
Yes, the option --output-gff3
will report all events that are found in the graph. However, the events will not get quantified (as in our use-cases so fare we never needed events that had only evidence for one of the 2 possible paths in all samples). I could think of a use case though, where you would count annotated paths as evidence and then quantify the events nevertheless. So what I was suggesting to also allow for quantification of events "unconfirmed" by the read data.
Regarding your other question, I will reply via e-mail.
Best, Andre
Gotcha -- yes! that seems like it would be useful to us. ~Joe
On Fri, Aug 9, 2019 at 9:32 AM Andre Kahles notifications@github.com wrote:
Yes, the option --output-gff3 will report all events that are found in the graph. However, the events will not get quantified (as in our use-cases so fare we never needed events that had only evidence for one of the 2 possible paths in all samples). I could think of a use case though, where you would count annotated paths as evidence and then quantify the events nevertheless. So what I was suggesting to also allow for quantification of events "unconfirmed" by the read data.
Regarding your other question, I will reply via e-mail.
Best, Andre
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/ratschlab/spladder/issues/89?email_source=notifications&email_token=AAEILKHMS3PL6XJMGGMLQXLQDWLZTA5CNFSM4H2US372YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD37FAFQ#issuecomment-519983126, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEILKHUHQAVIFRSCWWXMMTQDWLZTANCNFSM4H2US37Q .
Gotcha -- yes! that seems like it would be useful to us. ~Joe …
Hi Joe,
I am confused about how to figure out if an alternative splicing in the spladder output files is an annotated one in user-provided gtf. Do you have any idea about this question? If I cannot directly know which alternative splicing event is already annotated from the spladder output files, what should I do to solve it? Thank you very much! I'm looking forward to your reply~
Best, Minghao
Closing this, as the original issue has been resolved. Also, the labelling whether an event is annotated has now been added to the output in the latest release. Please open a new issue, if anything comes up using the latest release. Best, Andre
Description
I've had trouble finding "ground truth" test cases of clear splice alterations that I can test quickly, so I decided to spoof an exon skip by adding a fake exon in the middle of an existing intron.
What I Did
I created a fake "exon 2" in the first intron of one of the NCAM2 transcripts from chr21 (the second of these three GTF lines) and renumbered subsequent exons:
Then I ran:
... which completed in some number of minutes (since the annotation only details genes on chr21). SplAdder reported a handful of exon skip events on chr21, but not in the NCAM2 gene, even though ~42 reads are spliced across the real 1st exon, skipping the fake exon from above.
Can you tell me if I'm doing anything problematic, here? Shouldn't this generate an exon skip event? What am I missing? I can't provide data for this case, but I could replicate this on a publicly available dataset if you want to have a test case.
Thanks, ~Joe