japonicusdb / japonicus-curation

Data files for JaponicusDB

S. japonicus gene coordinate updates #16

Open ValWood opened 3 years ago

ValWood commented 3 years ago

old complement(join(1427768..1429375,1429526..1429801,1429905..1429964)) new

ValWood commented 3 years ago

This looks like a gene fusion. http://japonicusdb.kmr.nz/gene/SJAG_01119

Screenshot 2021-07-07 at 17 22 20

13-08-2021 I'm not so sure about this; I think the Ctr domain could be a false positive. Leaving for now.

snezhkaoliferenko commented 3 years ago

I haven't done this one yet. I'm not keen to start altering the sequence right now if there will be a new assembly. It might be easier to just map everything over from the original sequence. I could model the frameshifts here, but I think there must be additional errors to make this coding, because the discrepancies here would not resolve the frameshifts (there would still be stops), unless the feature start is also incorrect, which would affect the counting. @snezhkaoliferenko Can we look at this together sometime next week?

ValWood commented 3 years ago

SJAG_05309 and SJAG_05310 are both spo15 and need merging

ValWood commented 3 years ago

chromosome I

chromosome II. KE651167.contig

beautiful conservation of this new exon:

Screenshot 2021-08-12 at 19 30 38 Screenshot 2021-08-12 at 19 30 01

chromosome III. KE65112.contig

today's edits 12-08-2021 [main bfa6108] gene stucture edits 3 files changed, 24 insertions(+), 253 deletions(-)

ValWood commented 3 years ago

old join(1823068..1823505,1823646..1823669)

new ~join(1823068..1824015,1824018..1824464) much longer, no splice at previous point but later frameshift~

Scrub that, it's a splice site (conserved in pombe): join(1823068..1824014,1824059..1824464)

ValWood commented 3 years ago

old complement(join(1820740..1820814,1820867..1821146,1821210..1821237,1821305..1821506,1821573..1821746,1821924..1822529))

new complement(join(1820740..1820814,1820867..1821146,1821210..1821237,1821305..1821506,1821573..1821782))
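A quick sanity check for edits like this is to confirm that the summed exon length of each structure is still a multiple of three. The sketch below is only an illustration, not the tooling used for this ticket: `parse_location` and `cds_length` are hypothetical helper names, and the parser assumes simple `join(...)` / `complement(join(...))` strings with plain `start..end` ranges.

```python
import re

def parse_location(loc):
    """Return (strand, [(start, end), ...]) for a simple GenBank-style location.

    Assumes only plain start..end ranges, optionally wrapped in join() and
    complement(); fuzzy positions such as <1..100 are not handled.
    """
    strand = -1 if loc.strip().startswith("complement(") else 1
    segments = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", loc)]
    return strand, segments

def cds_length(loc):
    """Total number of bases covered by all segments (coordinates are inclusive)."""
    _, segments = parse_location(loc)
    return sum(end - start + 1 for start, end in segments)

old = ("complement(join(1820740..1820814,1820867..1821146,1821210..1821237,"
       "1821305..1821506,1821573..1821746,1821924..1822529))")
new = ("complement(join(1820740..1820814,1820867..1821146,1821210..1821237,"
       "1821305..1821506,1821573..1821782))")

for label, loc in (("old", old), ("new", new)):
    n = cds_length(loc)
    status = "multiple of 3" if n % 3 == 0 else "NOT a multiple of 3"
    print(f"{label}: {n} nt ({status})")
```

A length divisible by three does not prove the model is right (it says nothing about internal stops or splice signals), but a length that is not divisible by three is a quick flag that a boundary is off.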

[main 1f94d6f] gene structure updates committed

ValWood commented 3 years ago

~mpo1 SJAG_00519 old join(3191090..3191107,3191152..3191218,3191275..3191898,3191959..3192104,3192153..3192416) new join(3191090..3191107,3191152..3191218,3191275..3191915) C-term exons repurposed to gna1 glucosamine-phosphate N-acetyltransferase~

SJAG_00519, backtracking. It also appears to be fused in octosporus, so here I will override the product with:

ER membrane fatty acid alpha-oxygenase Mpo1/glucosamine-phosphate N-acetyltransferase Gna1, tandem fusion

ValWood commented 3 years ago

trim to met

ValWood commented 3 years ago

old complement(3174139..3174510)

~new complement(join(3174139..3174469,3174652..3174683)) updated, also running frameshifted version~

The frameshifted version eventually led me to: complement(join(3174139..3174240,3174316..3174439,3174485..3174575,3174668..3174683)) See below for details.
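To see what the final four-exon structure implies, the intron coordinates can be derived from the gaps between consecutive segments. This is a minimal sketch under the same assumption of plain `start..end` ranges; `intron_spans` is a hypothetical helper name, and the spans are reported in genomic order (on the complement strand the transcript order is the reverse).

```python
import re

def intron_spans(loc):
    """Return [(start, end), ...] for the gaps between consecutive segments.

    Coordinates are inclusive and given in genomic (ascending) order.
    """
    segments = sorted((int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", loc))
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]

final = ("complement(join(3174139..3174240,3174316..3174439,"
         "3174485..3174575,3174668..3174683))")

for start, end in intron_spans(final):
    print(f"intron {start}..{end}: {end - start + 1} nt")
```

With the coordinates above this gives three introns of 75, 45 and 92 nt, and the exons sum to 333 nt (111 codons including the stop).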

Screenshot 2021-08-12 at 17 54 30 Screenshot 2021-08-12 at 17 59 04
ValWood commented 3 years ago

I almost abandoned this one. Big intron:

Screenshot 2021-08-13 at 12 25 31 Screenshot 2021-08-13 at 12 17 36 Screenshot 2021-08-13 at 12 23 19

==

In smaller contigs. I'm not going to address these as they are often at the end of the contigs. I checked snf5 as it would be useful to get the Met in that but it's still missing.

SJAG_01111 snf5 SWI/SNF complex subunit Snf5 supercont5.31
SJAG_06639 zinc finger, CCHC-type supercont5.5
SJAG_06643 supercont5.26
SJAG_06638 supercont5.5
SJAG_05161 chromo/chromo shadow domain family supercont5.4
SJAG_06641 chromo/chromo shadow domain family supercont5.18

Still to do

There is only one upstream methionine. I can make an intron, but it doesn't look like a good intron. It is possible this has some sequence error. I would like to see if the methionine can be confirmed before I add it. ASK JUAN TO CHECK.

snezhkaoliferenko commented 3 years ago

Val!! R-e-s-p-e-c-t


From: Val Wood @.> Sent: 12 August 2021 17:57 To: japonicusdb/japonicus-curation @.> Cc: Oliferenko, Snezhana @.>; Comment @.> Subject: Re: [japonicusdb/japonicus-curation] S. japonicus gene coordinate updates (#16)

pfd2. This one was a bastard- took about 1.5 hours!


ValWood commented 3 years ago

[main d977fb1] final structure updates round1 3 files changed, 14 insertions(+), 186 deletions(-)

The ones I can do are now all committed.

@juanmatacambridge I have flagged ~6~ 7 in this ticket where it would be useful to see if you can see riboseq signal.

@snezhkaoliferenko I am struggling with FAS beta. Can we take a look at that next week?

snezhkaoliferenko commented 3 years ago

@ValWood sure thank you

juanmatacambridge commented 3 years ago

Hi,

I'm going away to Spain and won't have time to do them all before going. Just one...

03764 is interesting: there is a single translated ORF from 1,823,068 to 1,824,156. I can see the annotated intron spliced out in only two reads, but dozens of reads map to the intron, so it should be annotated as a single CDS without the intron. Then there is translation from 1,824,156 to 1,824,464. The two genes just overlap; there must be a mutation that changes the frame between the two ORFs. Quite a weird structure. How does it look in other Schizos?

Best,
Juan
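The frame relationship described here can be checked with simple modular arithmetic. The sketch below takes the coordinates quoted in this ticket at face value: the spliced structure join(1823068..1824014,1824059..1824464) proposed earlier, and 1,824,156 treated as a codon start for the downstream translated region, which is an assumption, since the riboseq boundary is only approximate. `relative_frame` is a hypothetical helper name.

```python
def relative_frame(anchor, pos):
    """Frame (0, 1 or 2) of pos relative to a codon starting at anchor (forward strand)."""
    return (pos - anchor) % 3

ORF1_START = 1823068          # start of the upstream ORF and of the annotated CDS
EXON1_END = 1824014           # end of exon 1 in the spliced model
EXON2_START = 1824059         # start of exon 2 in the spliced model
CDS_END = 1824464             # last base of the annotated CDS (end of the stop codon)
RIBOSEQ_ORF2_START = 1824156  # assumption: treated as a codon start

# Bases of exon 2 needed to complete the codon that the intron splits.
exon1_len = EXON1_END - ORF1_START + 1
carry = (-exon1_len) % 3
first_full_codon_exon2 = EXON2_START + carry

frames = {
    "first full codon of exon 2": relative_frame(ORF1_START, first_full_codon_exon2),
    "last codon of annotated CDS": relative_frame(ORF1_START, CDS_END - 2),
    "riboseq downstream region": relative_frame(ORF1_START, RIBOSEQ_ORF2_START),
}
for name, frame in frames.items():
    print(f"{name}: frame {frame}")
```

Under these assumptions all three positions fall in frame 2 relative to the upstream ORF start (frame 0), which is consistent with the downstream region being read in a different frame, whether that shift comes from splicing or from a frame-changing mutation.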

ValWood commented 3 years ago

because sometimes the minor isoform is the functional isoform (apn1 in pombe).... or it could be a retained intron, anyway the product makes it clear that this needs investigating further, and I will add "warnings" once we have these available.

[main 9ed759c] alg10 1 file changed, 1 insertion(+)

ValWood commented 3 years ago

tam13 is strange, see https://github.com/pombase/curation/issues/3069

juanmatacambridge commented 3 years ago

[1] Mug147 and tam13 are clearly distinct in S. pombe.

[2] SJAG_00388 is very weird. There is no evidence of translation on the annotated strand, but there is very strong translation in the complement.

The coordinates of the translated gene are complement 3,445,140 to 3,445,242. There must be a sequencing error or an unannotated intron, as the frame changes halfway through the protein.

It’s a microprotein, but the signal is very strong and clear. I can’t find it in pombe by blast!

Juan

PS. If you want to have the ribosome profiling of japonicus let me know, it can be very useful (or puzzling)…

ValWood commented 3 years ago

[2] above: OK, thanks. I have annotated SJAG_00388 as "dubious, no evidence for translation on annotated strand". I won't annotate the frameshifted microprotein, as it is only 33 AA and likely a uORF for SJAG_00389. I left this as a misc_feature for future reference.

ValWood commented 3 years ago

@juanmatacambridge don't worry about the others. We have all that we are doing for the paper, and have made good improvements. The rest is a 'work in progress' and can be handled by future work from the japonicus community.