japonicusdb / japonicus-curation

Data files for JaponicusDB

S. japonicus new genes #25

Open ValWood opened 3 years ago

ValWood commented 3 years ago

The ID ranges used so far are SJAG_00004 - SJAG_06643 and SJAG_16452 - SJAG_16460, so for new genes I will use

SJAG_07000 onwards
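
A minimal sketch of how the next free ID could be picked under this scheme. The helper names (`format_id`, `is_used`, `next_new_id`) are hypothetical and not part of this repository; the ranges are the ones quoted above, and the five-digit zero padding is inferred from the IDs in this ticket.

```python
# Hypothetical helper, not part of the japonicus-curation repository.
# Ranges already in use, as quoted in this ticket (inclusive).
USED_RANGES = [(4, 6643), (16452, 16460)]
# New genes start here, as stated in this ticket.
NEW_GENE_START = 7000


def format_id(n):
    """Format a number as a systematic ID, e.g. 7001 -> 'SJAG_07001'."""
    return f"SJAG_{n:05d}"


def is_used(n, assigned=()):
    """True if n falls in a known used range or was already handed out."""
    return any(lo <= n <= hi for lo, hi in USED_RANGES) or n in assigned


def next_new_id(assigned=()):
    """Return the lowest free ID at or above NEW_GENE_START."""
    n = NEW_GENE_START
    while is_used(n, assigned):
        n += 1
    return format_id(n)


print(next_new_id())              # SJAG_07000
print(next_new_id({7000, 7001}))  # SJAG_07002
```

Keeping the used ranges in one place means a later block such as SJAG_16452 - SJAG_16460 is skipped automatically if the counter ever reaches it.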

ValWood commented 3 years ago

@snezhkaoliferenko if you know of any missing genes that need annotating, please add them to this ticket. I think there will be quite a lot.

ValWood commented 3 years ago

As per: https://github.com/japonicusdb/japonicus-curation/issues/29

committed in [main 071a1eb]

snezhkaoliferenko commented 3 years ago

SJAG_07001 does not come up in JaponicusDB?

kimrutherford commented 3 years ago

I only made it today, it will show up the next time Kim reloads...

I'll do a new load tonight (UK time).

snezhkaoliferenko commented 3 years ago

thanks! @kimrutherford btw Canto has been offline for a couple days - you probably know about it?

ValWood commented 3 years ago

Screenshot 2021-07-26 at 19 23 04

[main 158a074] fix SJAG_01701 replaced by ~SJAG_07003~ SJAG_07002

kimrutherford commented 3 years ago

> btw Canto has been offline for a couple days - you probably know about it?

Sorry about that! Cambridge had a power cut a few days ago and japonicus Canto didn't restart. I hadn't set it up properly. I've fixed that now and Canto is back.

kimrutherford commented 3 years ago

> SJAG_07001 does not come up in JaponicusDB?

The site has been updated so it's there now: http://japonicusdb.kmr.nz/gene/SJAG_07001

ValWood commented 3 years ago

https://github.com/japonicusdb/japonicus-curation/issues/32

ValWood commented 3 years ago

I think the existing SJAG_02438 is really just a translation of a bit of 5' UTR, so I will flag it as dubious.

Screenshot 2021-07-29 at 15 37 54

ValWood commented 3 years ago

Very pleased with this one - I feel as though I am reviving some ancient craft!

SJAG_06617 looked odd. It turns out it was:

Screenshot 2021-07-29 at 16 18 55

I won't even keep SJAG_06617 as dubious; that would be confusing. It largely overlaps in the wrong frame (no shared amino acids), so it will be deleted.

and voilà:

Screenshot 2021-07-29 at 16 16 53

ValWood commented 3 years ago

[main f43a747] gene stucture updates
3 files changed, 10 insertions(+), 63 deletions(-)

ValWood commented 3 years ago

This may not be quite right, but it is the correct length, the final exon is definitely correct, and it now hits qcr10 where the previous structure did not.

Screenshot 2021-07-29 at 18 05 23 Screenshot 2021-07-29 at 18 07 14

[main 9142c84] gene stucture updates and removals

ValWood commented 3 years ago

Screenshot 2021-08-01 at 12 08 47 Screenshot 2021-08-01 at 12 16 49 Screenshot 2021-08-01 at 12 11 26

ValWood commented 3 years ago

created from N-term of /systematic_id="SJAG_03763"

@snezhkaoliferenko this might be of interest to you. This is an ER membrane integral protein implicated in sterol metabolism: SPAC56F8.07, a gene merge in japonicus.

ValWood commented 3 years ago

The trickiest one so far

Screenshot 2021-08-01 at 21 20 24 Screenshot 2021-08-01 at 21 20 47

ValWood commented 3 years ago

[main 1f94d6f] gene structure updates committed

ValWood commented 3 years ago

Once I figured out where this was, the intron was already sitting there waiting for me:

Screenshot 2021-08-03 at 14 41 45

60S ribosomal protein L41 (diddy)

[main ea27e8f] new genes ->7016

snezhkaoliferenko commented 3 years ago

> Very pleased with this one - I feel as though I am reviving some ancient craft!
>
> SJAG_06617 looked odd. It turns out it was:
>
> * [x] SJAG_07005, new gene in magenta.
>
> Screenshot 2021-07-29 at 16 18 55
>
> I won't even keep SJAG_06617 as dubious; that would be confusing. It largely overlaps in the wrong frame (no shared amino acids), so it will be deleted.
>
> and voilà:
>
> Screenshot 2021-07-29 at 16 16 53

this is amazeballs also because it gives us insight into japonicus mitochondrial metabolism, ruling out some possibilities. japonicus does not respire and it did lose a number of 'mitochondrial-related' genes

snezhkaoliferenko commented 3 years ago

* [x] New gene SJAG_07006 (qcr10) replaces deleted SJAG_03830

> This may not be quite right, but it is the correct length, the final exon is definitely correct, and it now hits qcr10 where the previous structure did not.
>
> Screenshot 2021-07-29 at 18 05 23 Screenshot 2021-07-29 at 18 07 14
>
> [main 9142c84] gene stucture updates and removals

ditto this, i thought it was absent

snezhkaoliferenko commented 3 years ago
* SJAG_07010

@ValWood thanks! interaction partners in pombe kick ass.

ValWood commented 3 years ago

> interaction partners in pombe kick ass.

wow, yes they are!

My rule of thumb: if something is present in human, pombe and cerevisiae, it will almost certainly be present in japonicus. There are gene losses in S. cerevisiae (mainly splicing and heterochromatin related) and gene losses in pombe (mainly metabolic, peroxisomal, fatty acid metabolism). But if genes are present in pombe, cerevisiae and human, they will usually be present in every other eukaryote.

So if you think you know of any small things that appear to be missing let me know. At present I'm looking at genes usually conserved 1:1 and present in human, pombe, cerevisiae.
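
As a rough illustration of this rule of thumb, the search list can be thought of as a set operation: take what is shared by human, pombe and cerevisiae, then subtract what is already annotated in japonicus. This is only a sketch; the function name and the toy group names (`g1` ... `g5`) are invented placeholders, not real orthology calls.

```python
# Toy presence/absence data. The group names and memberships below are
# invented placeholders for illustration, not real orthology calls.
presence = {
    "human": {"g1", "g2", "g3", "g5"},
    "pombe": {"g1", "g2", "g3", "g4"},
    "cerevisiae": {"g1", "g2", "g3"},
    "japonicus": {"g1", "g4"},
}


def likely_missing_annotations(presence, target="japonicus",
                               references=("human", "pombe", "cerevisiae")):
    """Groups present in every reference species but absent from the target.

    These are candidates for a missing gene model, not confirmed losses.
    """
    conserved = set.intersection(*(presence[s] for s in references))
    return sorted(conserved - presence[target])


print(likely_missing_annotations(presence))  # ['g2', 'g3']
```

Anything this returns is only a candidate to look for; a real loss (as discussed for some metabolic genes in this ticket) would show up in the same list.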

Mitochondrial proteins I haven't yet been able to find, but have looked for: cox8, hot13, mrx11, cmc4, img2, mitochondrial ribosomal protein subunit L9, rrg9. I still think they are lurking somewhere. It's really tricky to find small, highly spliced, or disordered proteins. I think some of these are all 3... need more strategies!

ValWood commented 3 years ago

I guess we can say that JaponicusDB is already useful! This is one of the points I am trying to make in the manuscript. You can't do an effective comparative analysis of processes and pathways with missing genes! These real world examples will be good for the conclusion ;)

ValWood commented 3 years ago

Checked up to date: everything in this ticket is now showing up correctly!

snezhkaoliferenko commented 3 years ago

@ValWood

OK, these are what I think are present in pombe but missing in japonicus:

SPBPB2B2.02 SPBC1105.04c SPBPB2B2.09c SPAC3H8.03 SPAC22G7.07c SPAC4G8.11c SPAC20G8.04c SPAC105.01c SPAC105.03c SPAC1002.01 SPAC1002.14 SPAC18G6.01c SPAC6C3.02c SPAC6B12.06c SPAC4F8.10c SPAC1805.02c SPAC513.05 SPAC2E1P3.01 SPAC31G5.06 SPAC24C9.16c SPAC6G9.03c SPAC3A11.06 SPAC4H3.08 SPAC25B8.11 SPAC25B8.18 SPAC27D7.04 SPAC27D7.06 SPAC20G4.05c SPAC29B12.12 SPBPB21E7.07 SPBC1198.14c SPBC337.04 SPBC409.16c SPBC691.01 SPBC3H7.06c SPBC29A10.11c SPBC19C7.11 SPBC776.06c SPBC3B8.06 SPBC1347.13c SPBC1652.01 SPCC4B3.06c SPCC132.04c SPCC1281.07c SPCC1223.03c SPAC513.07 SPAC11G7.03 SPBC902.05c


From the point of view of metabolism, Fbp1, Gut2 and the two subunits of isocitrate dehydrogenase (Idh1/2) are fascinating.

ValWood commented 3 years ago

Right, I think you are right about Fbp1 and Gut2. These would be difficult to miss, due to the conservation beyond eukaryotes and the large size. And the fact that both subunits are missing. I would not look for these. Etf 1&2 and idh1&2 are interesting too; again, I would not look for these, for the reasons above.

Some of the other differences are unsurprising, because they are multi-gene families which have various species-specific duplications and losses. Most of these probably have some partially redundant paralog.

I will continue to look for cox8, hot13, mix17, mrx11, atp10, saw1 and a few of the other diddy ones. It is possible, though, that some of these are required only for the expression or assembly of the missing metabolic pathway, and so they really are lost... Cox8 I am sure IS there, but at 66 AA and 4 exons it's challenging! I will continue to try to locate it... after lunch...
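
To put numbers on why a 66 AA, 4 exon gene is hard to spot, here is a back-of-envelope sketch. It assumes only coding sequence is counted (UTRs ignored) and that a stop codon is included; the function names are made up for illustration.

```python
def coding_length_nt(protein_length_aa, include_stop=True):
    """Coding sequence length in nucleotides: 3 nt per codon."""
    return 3 * (protein_length_aa + (1 if include_stop else 0))


def mean_exon_length_nt(protein_length_aa, n_exons):
    """Average coding length per exon, assuming all exons are coding."""
    return coding_length_nt(protein_length_aa) / n_exons


print(coding_length_nt(66))        # 201
print(mean_exon_length_nt(66, 4))  # 50.25
```

So each coding exon averages only about 50 nt, roughly 17 codons, which leaves very little sequence per exon for a similarity search to latch onto.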

ValWood commented 3 years ago

mzt1 mitotic spindle organizing protein Mzt1 chr2 980849..981070 /systematic_id="SJAG_07018"

I thought I had done this! I must have been confusing it with something else...