Closed kimrutherford closed 2 weeks ago
In all other cases the C-term/N-term is after a comma in the qualifier: qualifier=SPAC19B12.13,C-term
I've just noticed that some have an underscore and some have a "-".
There are some that are missing the gene name in the qualifier:
pombe | other | species | qualifier |
---|---|---|---|
SPAC13F5.04c | VTA1 | cerevisiae | N_term |
SPBC3H7.11 | ABP140 | cerevisiae | N_term |
SPAC22E12.10c | COX15 | sapiens | N-term |
SPBC21C3.07c | ABP140 | cerevisiae | C_term |
"HGMP" looks like a typo here:
pombe | other | species | qualifier |
---|---|---|---|
SPBC405.01 | GART | sapiens | HGMP:GART,N-term |
SPCC569.08c | GART | sapiens | HGMP:GART,C-term |
For completeness, here are all the orthologs and qualifiers from the contig files:
pombe | other | species | qualifier |
---|---|---|---|
SPBP4H10.15 | ACO2 | cerevisiae | SPBP4H10.15,N-term |
SPBP4H10.15 | MRPL49 | cerevisiae | SPBP4H10.15,C-term |
SPBC530.12c | CAX4 | cerevisiae | SPBC530.12c,C-term |
SPBC16A3.11 | RAD30 | cerevisiae | SPBC16A3.11,N-term |
SPBC16A3.11 | ECO1 | cerevisiae | SPBC16A3.11,C-term |
SPAC806.02c | CFD1 | cerevisiae | SPAC806.02c,N-term |
SPAC806.02c | CIA1 | cerevisiae | SPAC806.02c,C-term |
SPAC6F12.05c | YJR142W | cerevisiae | SPAC6F12.05c,N-term |
SPAC6F12.05c | THI80 | cerevisiae | SPAC6F12.05c,C-term |
SPAC2C4.12c | TPT1 | cerevisiae | SPAC2C4.12c,N-term |
SPAC2C4.12c | YAE1 | cerevisiae | SPAC2C4.12c,C-term |
SPAC22E12.10c | COX15 | cerevisiae | SPAC22E12.10c,N-term |
SPAC22E12.10c | YAH1 | cerevisiae | SPAC22E12.10c,C-term |
SPAC22A12.08c | YKR070W | cerevisiae | SPAC22A12.08c,N_term |
SPAC22A12.08c | CRD1 | cerevisiae | SPAC22A12.08c,C_term |
SPAC19B12.13 | RSM22 | cerevisiae | SPAC19B12.13,N-term |
SPAC19B12.13 | COX11 | cerevisiae | SPAC19B12.13,C-term |
SPAC15E1.04 | CDC21 | cerevisiae | SPAC15E1.04,C-term |
SPAC1420.04c | RSM22 | cerevisiae | SPAC1420.04c,N-term |
SPAC1420.04c | COX11 | cerevisiae | SPAC1420.04c,C-term |
SPAC13F5.04c | VTA1 | cerevisiae | N_term |
SPBC3H7.11 | ABP140 | cerevisiae | N_term |
SPBC21C3.07c | ABP140 | cerevisiae | C_term |
SPBC530.12c | PPT1 | sapiens | SPBC530.12c,N-term |
SPAC806.02c | NUBP2 | sapiens | SPAC806.02c,N-term |
SPAC806.02c | CIAO1 | sapiens | SPAC806.02c,C-term |
SPAC22E12.10c | COX15 | sapiens | N-term |
SPBC405.01 | GART | sapiens | HGMP:GART,N-term |
SPCC569.08c | GART | sapiens | HGMP:GART,C-term |
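The inconsistencies listed above (underscore instead of hyphen, missing gene ID, an unexpected prefix such as "HGMP:GART") could be flagged automatically. The following is only a sketch: it assumes the target format is `<systematic ID>,N-term` or `<systematic ID>,C-term`, and the function name `check_qualifier` is made up for illustration, not part of any PomBase code.

```python
import re

# Matches the terminus at the end of a qualifier, with either "-" or "_".
TERM = re.compile(r"([NC])[-_]term$")


def check_qualifier(pombe_id, qualifier):
    """Return a list of problems found in one ortholog qualifier.

    Assumes the standard form is "<pombe_id>,N-term" or "<pombe_id>,C-term".
    An empty list means the qualifier already matches that form.
    """
    term_match = TERM.search(qualifier)
    if term_match is None:
        return ["no N-term/C-term found"]

    problems = []
    if "_term" in qualifier:
        problems.append("underscore instead of hyphen")

    # Everything before the terminus, minus the separating comma.
    prefix = qualifier[:term_match.start()].rstrip(",")
    if prefix == "":
        problems.append("missing gene ID")
    elif prefix != pombe_id:
        problems.append(f"unexpected prefix: {prefix}")
    return problems


# A few rows taken from the table above.
rows = [
    ("SPAC19B12.13", "SPAC19B12.13,C-term"),
    ("SPAC22A12.08c", "SPAC22A12.08c,N_term"),
    ("SPAC13F5.04c", "N_term"),
    ("SPBC405.01", "HGMP:GART,N-term"),
]
for pombe_id, qualifier in rows:
    print(pombe_id, qualifier, check_qualifier(pombe_id, qualifier))
```

The check only reports problems; deciding what the corrected qualifier should be (for example for the "HGMP:GART" rows) is left to a curator.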
from https://github.com/pombase/curation/issues/3455
At the same time, some of the fusions do not have their human orthologs. Do these too.
Annotate human orthologs similarly to S. cerevisiae (N-term, C-term)
Systematic ID | Gene name | Product description |
---|---|---|
SPAC22A12.08c | crd1 | cardiolipin synthase/ hydrolase fusion protein Crd1 |
SPAC806.02c | | CIA machinery CIA1/CFD1 fusion protein |
SPAC1420.04c | cox1101 | cytochrome c oxidase assembly protein Cox1101/ mitochondrial ribosomal protein Rsm22 fusion protein |
SPAC19B12.13 | cox1102 | cytochrome c oxidase assembly protein Cox1102/ mitochondrial ribosomal protein Rsm2202, fusion protein |
SPCC1223.08c | dfr1 | dihydrofolate reductase/ lysophospholipase fusion protein Dfr1 |
SPAC22E12.10c | etp1 | mitochondrial [2Fe-2S] cluster assembly ferredoxin Etp1/ cytochrome oxidase cofactor Cox15, fusion protein |
SPAC1782.04 | cox24 | mitochondrial mRNA processing protein Cox24/Pet20 |
SPBP4H10.15 | aco2 | mitochondrial ribosomal protein subunit L21/aconitate hydratase, fusion protein |
SPBC2D10.09 | snr1 | mitochondrial ribosomal protein subunit S47/3-hydroxyisobutyryl-CoA hydrolase Snr1 |
SPBC16A3.11 | eso1 | mitotic cohesin N-acetyltransferase/DNA polymerase eta Eso1 fusion protein |
SPBC530.12c | pdf1 | palmitoyl protein thioesterase/ dolichol pyrophosphate phosphatase fusion protein Pdf1 |
SPCC1450.15 | | pig-F/3-ketosphinganine reductase fusion protein |
SPAC6F12.05c | tnr3 | thiamine diphosphokinase Tnr3/ Nudix hydrolase fusion protein |
SPAC15E1.04 | hal3 | thymidylate synthase / phosphopantothenoylcysteine decarboxylase / protein phosphatase inhibitor moonlighting protein Hal3 |
SPBC13E7.02 | cwf24 | ubiquitin-protein ligase E3/GCN5-related N acetyltransferase fusion protein |
SPCC1442.07c | wss2 | ubiquitin/metalloprotease fusion protein Udp7 |
METTL17 for cox11 https://www.biorxiv.org/content/10.1101/2022.11.24.517765v1
Reviewed and added the missing human orthologs to match the S. cerevisiae ones, and made a couple of small fixes to the cerevisiae ones:
pombe | other | qualifier | date |
---|---|---|---|
SPAC1420.04c | COX11 | SPAC19B12.13,C-term | 2024-08-29 |
SPAC1420.04c | METTL17 | SPAC19B12.13,N-term | 2024-08-29 |
SPAC19B12.13 | COX11 | SPAC19B12.13,C-term | 2024-08-29 |
SPAC19B12.13 | METTL17 | SPAC19B12.13,N-term | 2024-08-29 |
SPAC22A12.08c | HDHD5 | SPAC22A12.08c,N-term | 2024-08-29 |
SPAC22A12.08c | CRLS1 | SPAC22A12.08c,C-term | 2024-08-29 |
SPAC22E12.10c | FDX2 | SPAC22E12.10c,C-term | 2024-08-29 |
SPAC2C4.12c | TRPT1 | SPAC2C4.12c,N-term | 2024-08-29 |
SPAC2C4.12c | YEA1 | SPAC2C4.12c,C-term | 2024-08-29 |
SPAC6F12.05c | TPK1 | SPAC6F12.05c,C-term | 2024-08-29 |
SPAC6F12.05c | NUDT19 | SPAC6F12.05c,N-term | 2024-08-29 |
SPBC16A3.11 | ESCO1 | SPBC16A3.11,C-term | 2024-08-29 |
SPBC16A3.11 | POLH | SPBC16A3.11,N-term | 2024-08-29 |
SPBC530.12c | DOLPP1 | SPBC530.12c,C-term | 2024-08-29 |
SPBP4H10.15 | ACO2 | SPBP4H10.15,C-term | 2024-08-29 |
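Since the qualifier normally starts with the gene's own systematic ID, rows like the ones above can be cross-checked by comparing the two. This is a rough sketch only: `find_id_mismatches` is a hypothetical helper, and whether a mismatch is an error or intentional is a curation decision.

```python
def find_id_mismatches(rows):
    """Return the rows whose qualifier names a different gene than the row itself.

    Each row is a (pombe_id, ortholog, qualifier) tuple, where the qualifier
    is assumed to look like "<systematic ID>,N-term" or "<systematic ID>,C-term".
    """
    mismatches = []
    for pombe_id, ortholog, qualifier in rows:
        # The part before the first comma is taken to be the gene ID.
        qualifier_id = qualifier.split(",")[0]
        if qualifier_id != pombe_id:
            mismatches.append((pombe_id, ortholog, qualifier))
    return mismatches


# A few rows copied from the list above.
rows = [
    ("SPAC1420.04c", "COX11", "SPAC19B12.13,C-term"),
    ("SPAC19B12.13", "COX11", "SPAC19B12.13,C-term"),
    ("SPAC22A12.08c", "CRLS1", "SPAC22A12.08c,C-term"),
]
for row in find_id_mismatches(rows):
    print("check:", row)
```

With these sample rows, only the SPAC1420.04c row is reported, because its qualifier starts with SPAC19B12.13.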
So happy to do this; it's been on my mind since before you went back to New Zealand. I think I'll treat myself to a glass of wine for this one... much easier in the flat file!
Hi Val.
I've found that there are three ways that fusions are annotated. For implementing pombase/pombase-chado#993 it would be helpful to standardise.
SPAC1782.04/cox24 is like this with "(C-term)" / "(N-term)" after the gene ID:
These aren't displayed on the gene pages at the moment.
his2 and his7 have the C-term/N-term in brackets in the qualifier:
his2 / SPBC1711.13:
his7 / SPBC29A3.02c:
In all other cases the C-term/N-term is after a comma in the qualifier: