Open nathandunn opened 9 years ago
@monicacecilia Please comment and then assign to me with recommendations when you are testing.
It's all coming back.
Desktop Apollo had a function that allowed curators to shift the frame of translation +1 or -1 from the base pair where the cursor stood.
This is what it looked like:
In some organisms, cells naturally shift the frame of translation to express a gene (the ribosome skips, basically). This was common in some Drosophila genes and the request was made way back when. For an example see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC108870/
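For anyone picking this up later, here is a minimal Python sketch of what a ±1 shift does to the translation. It is not Apollo code: `translate_with_shift`, `shift_pos`, and the toy sequence are hypothetical names made up for illustration, and it assumes the standard genetic code and a forward-strand sequence that starts in frame.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_with_shift(seq, shift_pos, shift):
    """Translate seq, shifting the reading frame at 0-based index shift_pos.

    shift = +1 skips one base at shift_pos; shift = -1 re-reads the base
    just before shift_pos; shift = 0 is a plain translation.
    Stop codons are emitted as "*" rather than ending translation.
    """
    if shift_pos + shift < 0:
        raise ValueError("shift would move before the start of the sequence")
    spliced = seq[:shift_pos] + seq[shift_pos + shift:]
    usable = len(spliced) - len(spliced) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[spliced[i:i + 3]] for i in range(0, usable, 3))


seq = "ATGAAACCCGGG"  # toy sequence, not from any real annotation
print(translate_with_shift(seq, 6, 0))   # MKPG
print(translate_with_shift(seq, 6, -1))  # MKTR
print(translate_with_shift(seq, 6, +1))  # MKP
```

Larger shifts, like the phage cases mentioned in this thread, would just be other values of `shift` under the same assumptions.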
This code should be re-implemented, but this is not of the highest priority at this moment. I'm punting this down to the time after coordinate transformation and variant annotation are implemented and working as desired.
:+1:
@selewis & @nathandunn: It will be very useful to come back to this ticket and work the implementation of this functionality in the near future.
Very common in phages, but sometimes the frameshifts are more than just ±1, e.g. http://www.sciencedirect.com/science/article/pii/S1097276504005398
Incredibly important to CPT's use case I believe. cc @moffmade
The lack of support is a bit of a complex issue, since JBrowse will not render to-spec GFF3 that includes frameshifts. xref https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md (you'll have to ctrl-f for "programmed frameshift").
What is the status of resolving this issue?
I think we've deferred due to our time constraints. However, if this is something you'd be interested in implementing, we'd be more than happy to work with you on it. Also, we are doing a hackathon in January if that would be convenient.
I'm asking based on the class that @erasche was referring to, which we will start teaching again in January. @moffmade is now working with us on continuing Eric's work, and the timing is bad for us to attend. But I just sent him the link to look at the agenda. We have an even more critical Apollo problem that he will add an issue here for soon.
@moffmade is welcome to join us remotely, as well, but that will be a busy time for teaching. Yeah, let us know about the critical problems and your timeline for teaching. Our hope is that we can possibly get @moffmade doing a few of these fixes himself after getting somewhat familiar with the stack, if he has time.
oops. Updated my reply above for Corey's correct id.
@jimhu-tamu / @MoffMade, @erasche's assessment is probably correct. I'll be available to do a remote call on the 4th if it's something you might be interested in pursuing. However, I would estimate 2-4 weeks even with our help, if I remember this issue correctly.
Maybe @erasche can make some introductions off-line. We can make arrangements over the break (and am happy to point folks to resources).
Offline introduction? I'm physically unavailable until February (holiday).
Sorry, I meant off of GitHub, via email. No need for travel! I'll wait until the Galaxy conference to see you in person.
Nathan
Back at work, sure, available on the 4th if you need a videoconf or something for more detailed explanation.
From notes:
Treat similarly to a read through stop codon, but base specific
Per discussions with TAMU group @meiliucpt will add some export examples.
Using NCBI: https://www.ncbi.nlm.nih.gov/nuccore/1428093527 as an example (GenBank: MH321492.1: /locus_tag="Lorac_015" is the frameshifted protein, and /locus_tag="Lorac_014" reads through the slippery sequence to the ORF's normal stop codon),
GenBank record for the frameshift protein and its non-shifted version should look like this:
The converted gff3 (converted using our GenBank - GFF3 converter which is from BioPerl) looks like this:
The frameshifted and "normal" reading frames are represented as 2 separate genes.
In the frameshifted feature (Lorac_015), the GFF3 has the gene (Shine-Dalgarno + CDS) as parent, with the mRNA (1st base of CDS to last base of CDS) and Shine-Dalgarno as children. Under the mRNA are 2 CDS and 2 exon features. We're not sure how the frameshift is represented: are the 2 CDSs or 2 exons automatically merged into a single protein sequence when read?
Based on what we see, it looks like we've been representing frameshifts as basically 2 exons which then get merged (i.e., like 2 exons separated by an intron that is -1 bp in length), which is derived from how these are represented in GenBank. If we switched to representing these as an mRNA with a frameshift in it, that could be done but would be a departure from the current process and we'd need to make sure we had a way to place these features in Apollo and export them again in a way that GenBank can handle. I hope this explanation makes sense. Let me know if you have questions.
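To make the "intron that is -1 bp in length" idea concrete, here is a rough Python sketch, under the assumption that segments are 1-based inclusive (start, end) pairs on the forward strand, as in a GenBank `join(...)` location. The function names and coordinates are hypothetical and are not taken from the Lorac record or from Apollo.

```python
def splice_segments(genome, segments):
    """Concatenate 1-based inclusive (start, end) segments, in the order given."""
    return "".join(genome[start - 1:end] for start, end in segments)


def segment_gaps(segments):
    """Number of bases between consecutive segments.

    A normal intron gives a positive gap; a gap of -1 means the next segment
    starts on the last base of the previous one (a -1 frameshift), and a gap
    of +1 means exactly one base is skipped (a +1 frameshift).
    """
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(segments, segments[1:])]


genome = "ATGAAACCCGGG"           # toy sequence for illustration only
minus_one = [(1, 6), (6, 12)]     # second segment re-reads base 6
print(segment_gaps(minus_one))             # [-1]
print(splice_segments(genome, minus_one))  # ATGAAAACCCGGG
```

Under this model the two CDS segments are simply concatenated before translation, so the overlapping base is read twice; whether a given reader or exporter actually merges them this way is something we would still need to confirm.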
====
(mostly . . . allow exons to overlap), just for specific isoforms
As @cmdcolin noted, we have a bunch of code for frameshifts already in the codebase, but we do not appear to actually be able to add them.
Something for us to discuss at some point.
===
Output both annotations (original and pre-frameshifted).