pombase / curation

Final paralogs, #3758

Open ValWood opened 2 weeks ago

ValWood commented 2 weeks ago

I have a large list of >277 genes that I still need to check for paralogs

This is quite time consuming, but I have a heuristic!

Could you extract the PANTHER families for these? If a panther family has 2 members, they would be paralogs. If a panther family has 1 member, it can be removed from this list.

(that should sort most of them)
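
A minimal sketch of the heuristic described above, assuming the PANTHER assignments are already available as (family ID, gene ID) pairs. The function name `split_by_family_size` and the input layout are illustrative assumptions, not part of any PomBase tooling; the example rows are taken from the tables later in this thread.

```python
from collections import defaultdict

def split_by_family_size(rows):
    """Group genes by PANTHER family.

    rows: iterable of (panther_id, gene_id) pairs for the genes in the list.
    Returns (candidates, singletons): families with 2 or more list members
    (possible paralogs) and genes whose family has only 1 list member.
    """
    families = defaultdict(set)
    for panther_id, gene_id in rows:
        families[panther_id].add(gene_id)

    # Families with at least two members in the list are paralog candidates.
    candidates = {
        fam: sorted(genes) for fam, genes in families.items() if len(genes) >= 2
    }
    # Genes that are the only list member of their family can be dropped.
    singletons = sorted(
        gene for genes in families.values() if len(genes) == 1 for gene in genes
    )
    return candidates, singletons

# Example rows copied from the tables in this thread.
rows = [
    ("PTHR24012", "SPAC1610.03c"),
    ("PTHR24012", "SPAC343.07"),
    ("PTHR18884", "SPBC19F8.01c"),
    ("PTHR18884", "SPAC24C9.15c"),
    ("PTHR23092", "SPCC663.12"),
]
candidates, singletons = split_by_family_size(rows)
for fam, genes in candidates.items():
    print(fam, genes)
print("single-member families:", singletons)
```

Candidates still need a manual check, since a shared family does not by itself confirm a paralog relationship.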

kimrutherford commented 2 weeks ago

I have a large list of >277 genes that I still need to check for paralogs

Could you link to the list?

ValWood commented 2 weeks ago

oops!

now down to 265 https://www.pombase.org/results/from/id/311eb4b3-506f-46b6-a460-07384637996a

kimrutherford commented 2 weeks ago

Do you know of a pombe gene ID to Panther ID mapping file?

ValWood commented 2 weeks ago

From the InterPro file. I was accessing them from the domain section.

kimrutherford commented 2 weeks ago

From the InterPro file.

Thanks.

If a panther family has 2 members, they would be paralogs

Do you mean 2 members that are in your list of 265?

ValWood commented 2 weeks ago

Yep

ValWood commented 2 weeks ago

Although I'm interested if there are members outside of the 265 (but PANTHER sometimes misassigns, although, so do I!)
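
The variant mentioned here (family members outside the list) could be sketched as follows, assuming a genome-wide gene-to-family mapping is available as a dictionary. The function name `families_with_outside_members` is hypothetical, and the gene and family IDs in the example are made-up placeholders rather than real pombe or PANTHER identifiers.

```python
from collections import defaultdict

def families_with_outside_members(genome_map, gene_list):
    """Report families that have more than 1 gene genome-wide.

    genome_map: dict of gene_id -> panther_id covering all genes (assumed input).
    gene_list: set of gene IDs still to be checked.
    Returns {panther_id: (members_in_list, members_outside_list)} for every
    family with at least one gene in the list and more than one gene overall.
    """
    members = defaultdict(set)
    for gene_id, panther_id in genome_map.items():
        members[panther_id].add(gene_id)

    report = {}
    for panther_id, genes in members.items():
        in_list = genes & gene_list
        if in_list and len(genes) > 1:
            report[panther_id] = (sorted(in_list), sorted(genes - gene_list))
    return report

# Placeholder identifiers, for illustration only.
genome_map = {
    "gene1": "FAM_A",
    "gene2": "FAM_A",
    "gene3": "FAM_B",
    "gene4": "FAM_C",
    "gene5": "FAM_C",
}
gene_list = {"gene1", "gene3", "gene4", "gene5"}
report = families_with_outside_members(genome_map, gene_list)
for fam, (inside, outside) in sorted(report.items()):
    print(fam, "in list:", inside, "outside list:", outside)
```

A family with a single list member but one or more members outside the list is still reported, which is what distinguishes this from the stricter "2 members in the list" check.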

kimrutherford commented 2 weeks ago

These are the Panther IDs with 2 or more genes in your list. I hope this is what you had in mind:


PTHR24012       SPAC1610.03c    crp79   poly(A) binding protein Crp79   Q9P6M8
PTHR24012       SPAC343.07      mug28   RNA-binding protein Mug28, implicated in mRNA processing        Q9UT83

PTHR31492       SPBC21D10.06c   map4    cell surface adhesion protein for conjugation Map4      O74346
PTHR31492       SPBC1348.08c            cell surface glycoprotein, adhesion molecule    P0CU04
PTHR31492       SPAC977.07c     pfl6    cell surface glycoprotein, flocculin Pfl6       P0CU05
PTHR31492       SPCC188.09c     pfl4    cell surface glycoprotein, flocculin Pfl4       Q7Z9I1
PTHR31492       SPAC1F8.06      pfl8    cell surface glycoprotein, flocculin Pfl8       Q92344
PTHR31492       SPAPB2C8.01             cell surface glycoprotein, adhesion molecule    Q9C0Y2
PTHR31492       SPAP11E10.02c   mam3    cell surface adhesion protein for conjugation Mam3      Q9HDY9

PTHR31001       SPAC1F7.11c             DNA-binding transcription factor, zf-fungal binuclear cluster type      Q09922
PTHR31001       SPAC139.03      toe2    DNA-binding transcription factor, zf-fungal binuclear cluster type Toe2 Q9UTN0

PTHR19304       SPBC29B5.01     atf1    DNA-binding transcription factor, Atf-CREB family Atf1  P52890
PTHR19304       SPBC2F12.09c    atf21   DNA-binding transcription factor, Atf-CREB family Atf21 P78962
PTHR19304       SPAC22F3.02     atf31   DNA-binding transcription factor Atf31  Q09771
PTHR19304       SPAC21E11.03c   pcr1    DNA-binding transcription factor Pcr1   Q09926

PTHR10177       SPBC2G2.09c     crs1    meiosis specific cyclin Crs1    O43008
PTHR10177       SPBC19F5.01c    puc1    G1 cyclin Puc1  P25009

PTHR46910       SPBC530.05      prt1    DNA-binding transcription factor Prt1   O59741
PTHR46910       SPBC530.08      ntu2    DNA-binding transcription factor, membrane-tethered     O59744
PTHR46910       SPBC1773.12             DNA-binding transcription factor, zf-fungal binuclear cluster type      O94569
PTHR46910       SPBC1773.16c            DNA-binding transcription factor, zf-fungal binuclear cluster type      O94573
PTHR46910       SPAC11D3.07c    toe4    DNA-binding transcription factor, zf-fungal binuclear cluster type      Q10086
PTHR46910       SPAPB24D3.01    toe3    DNA-binding transcription factor        Q9C0Z1

PTHR10000       SPBC215.10      odr1    HAD superfamily hydrolase, unknown role O94314
PTHR10000       SPAC25B8.12c            HAD superfamily hydrolase, unknown role Q9UTA6

PTHR18884       SPBC19F8.01c    spn7    meiotic septin Spn7     O60165
PTHR18884       SPAC24C9.15c    spn5    meiotic septin Spn5     P48010

PTHR31162       SPCC794.06              transmembrane transporter       O59815
PTHR31162       SPAPB8E5.03     mae1    plasma membrane malate/succinate:proton symporter Mae1  P50537

PTHR36206       SPBC15D4.02     gsf1    DNA-binding transcription factor, zf-fungal binuclear cluster type Gsf1 O74308
PTHR36206       SPAC821.07c     moc3    DNA-binding transcription factor Moc3   Q9UT46

PTHR10270       SPBC1711.02     mat3-Mc mating type M-specific HMG-box DNA-binding transcription factor Mc at silenced MAT3 locus       P0CY16
PTHR10270       SPBC23G7.09     mat1-Mc M-specific trancription factor Mc       P0CY17

PTHR43008       SPAC22A12.17c           short chain dehydrogenase       O13908
PTHR43008       SPCC1739.08c            short chain dehydrogenase       O74470

PTHR23502       SPAC977.04              truncated C terminal region of membrane transporter     G2TRN8
PTHR23502       SPBC609.04      caf5    plasma membrane spermine family transmembrane transporter Caf5  O94528

PTHR43625       SPBC215.11c             aldo/keto reductase, unknown biological role    O94315
PTHR43625       SPAC1F7.12      yak3    aldose reductase ARK13 family YakC, implicated in cellular detoxification from family members   Q09923

PTHR47338       SPBC530.11c             DNA-binding transcription factor, zf-fungal binuclear cluster type      O59746
PTHR47338       SPAC1327.01c            DNA-binding transcription factor, zf-fungal binuclear cluster type      Q1MTM9
PTHR47338       SPAPB1A11.04c   mca1    DNA-binding transcription factor, zf-fungal binuclear cluster type, meiosis-specific copper activator, Mca1     Q9HDX1

DONE:

PTHR31323       SPAC2C4.17c     msy2    MS ion channel protein 2        O14050
PTHR31323       SPCC1183.11     msy1    MS calcium ion channel protein Msy1     O74839

PTHR43976       SPCC162.03              short chain dehydrogenase       O74628
PTHR43976       SPBC1348.09             short chain dehydrogenase, implicated in cellular detoxification        P0CU01

PTHR10357       SPAC27E2.01             alpha-amylase homolog   O13996
PTHR10357       SPAC25H1.09     mde5    alpha-amylase homolog Mde5      O14154
PTHR10357       SPBC16A3.13     meu7    alpha-amylase homolog Aah4      O42918
PTHR10357       SPCC757.12      aah1    cell wall alpha-amylase homolog Aah1    O74922
PTHR10357       SPAC23D3.14c    aah2    alpha-amylase homolog Aah2      Q09840
PTHR10357       SPCC11E10.09c           alpha-amylase homolog   Q10427
PTHR10357       SPCC63.02c      aah3    cell wall alpha-amylase homolog Aah3    Q9Y7S9

PTHR47990       SPAC25B8.13c    isp7    2-OG-Fe(II) oxygenase superfamily protein       P40902
PTHR47990       SPCC1494.01             iron/ascorbate oxidoreductase family    Q7LL04

PTHR31560       SPAC22H10.08            DUF2009 family protein, conserved in yeast and apicomplexa      Q10301
PTHR31560       SPCC16A11.03c           DUF2009 family protein, conserved in yeast and apicomplexa      Q9USN2

kimrutherford commented 2 weeks ago

Although I'm interested if there are members outside of the 265 (but PANTHER sometimes misassigns, although, so do I!)

This is the list of Panther IDs and genes from your list where the Panther ID matches more than 1 gene (counting genes outside your list):

PTHR43008       SPAC22A12.17c           short chain dehydrogenase       O13908
PTHR43008       SPCC1739.08c            short chain dehydrogenase       O74470
PTHR31492       SPBC21D10.06c   map4    cell surface adhesion protein for conjugation Map4      O74346
PTHR31492       SPBC1348.08c            cell surface glycoprotein, adhesion molecule    P0CU04
PTHR31492       SPAC977.07c     pfl6    cell surface glycoprotein, flocculin Pfl6       P0CU05
PTHR31492       SPCC188.09c     pfl4    cell surface glycoprotein, flocculin Pfl4       Q7Z9I1
PTHR31492       SPAC1F8.06      pfl8    cell surface glycoprotein, flocculin Pfl8       Q92344
PTHR31492       SPAPB2C8.01             cell surface glycoprotein, adhesion molecule    Q9C0Y2
PTHR31492       SPAP11E10.02c   mam3    cell surface adhesion protein for conjugation Mam3      Q9HDY9
PTHR23502       SPAC977.04              truncated C terminal region of membrane transporter     G2TRN8
PTHR23502       SPBC609.04      caf5    plasma membrane spermine family transmembrane transporter Caf5  O94528
PTHR23092       SPCC663.12      cid12   poly(A) polymerase Cid12        O74518
PTHR12271       SPAC821.04c     cid13   cytoplasmic poly(A) polymerase Cid13    Q9UT49
PTHR47990       SPAC25B8.13c    isp7    2-OG-Fe(II) oxygenase superfamily protein       P40902
PTHR47990       SPCC1494.01             iron/ascorbate oxidoreductase family    Q7LL04
PTHR43173       SPAC14C4.09     agn1    cell wall glucan endo-1,3-alpha-glucosidase Agn1        O13716
PTHR43249       SPBC12C2.04             NAD binding dehydrogenase family protein        Q09745
PTHR11986       SPAC27F1.05c            aminotransferase class-III, unknown specificity Q10174
PTHR48012       SPAC12B10.14c   tea5    pseudokinase Tea5       Q10447
PTHR13710       SPAC212.06c             DNA helicase in rearranged telomeric region, truncated  G2TRN7
PTHR47263       SPBC21C3.20c    git1    C2 domain protein Git1  Q9P7K5
PTHR46517       SPCC1620.13             phosphoglycerate mutase/6-phosphofructo-2-kinase family O94420
PTHR31323       SPAC2C4.17c     msy2    MS ion channel protein 2        O14050
PTHR31323       SPCC1183.11     msy1    MS calcium ion channel protein Msy1     O74839
PTHR43625       SPBC215.11c             aldo/keto reductase, unknown biological role    O94315
PTHR43625       SPAC1F7.12      yak3    aldose reductase ARK13 family YakC, implicated in cellular detoxification from family members   Q09923
PTHR18884       SPBC19F8.01c    spn7    meiotic septin Spn7     O60165
PTHR18884       SPAC24C9.15c    spn5    meiotic septin Spn5     P48010
PTHR10644       SPCC11E10.08    rik1    CLRC ubiquitin ligase complex WD repeat subunit Rik1    Q10426
PTHR47338       SPBC530.11c             DNA-binding transcription factor, zf-fungal binuclear cluster type      O59746
PTHR47338       SPAC1327.01c            DNA-binding transcription factor, zf-fungal binuclear cluster type      Q1MTM9
PTHR47338       SPAPB1A11.04c   mca1    DNA-binding transcription factor, zf-fungal binuclear cluster type, meiosis-specific copper activator, Mca1     Q9HDX1
PTHR14430       SPCC1183.12     spo13   sporulation specific RabGEF Spo13       C6Y4C9
PTHR12304       SPBC800.11              inosine-uridine preferring nucleoside hydrolase Q9HGL1
PTHR24346       SPAC23H4.02     ppk9    serine/threonine protein kinase Ppk9    O13945
PTHR48100       SPBPB21E7.02c           phosphoglycerate mutase/6-phosphofructo-2-kinase family U3H041
PTHR43731       SPBP4H10.10     rbd3    mitochondrial rhomboid family peptidase Rbd3    Q9P7D8
PTHR31162       SPCC794.06              transmembrane transporter       O59815
PTHR31162       SPAPB8E5.03     mae1    plasma membrane malate/succinate:proton symporter Mae1  P50537
PTHR45808       SPBC354.13      rga6    RhoGAP for Cdc42, Rga6  O43027
PTHR42791       SPAC56E4.07             N-acetyltransferase     O14195
PTHR11802       SPAC1296.03c    sxa2    serine carboxypeptidase Sxa2    P32825
PTHR13355       SPAC11D3.02c            ELLA family acetyltransferase   Q10081
PTHR31560       SPAC22H10.08            DUF2009 family protein, conserved in yeast and apicomplexa      Q10301
PTHR31560       SPCC16A11.03c           DUF2009 family protein, conserved in yeast and apicomplexa      Q9USN2
PTHR36206       SPBC15D4.02     gsf1    DNA-binding transcription factor, zf-fungal binuclear cluster type Gsf1 O74308
PTHR36206       SPAC821.07c     moc3    DNA-binding transcription factor Moc3   Q9UT46
PTHR43172       SPBC8E4.05c             fumarate lyase superfamily, unknown specificity, bacterial 3-carboxy-cis,cis-muconate cycloisomerase related    O42889
PTHR32268       SPBC106.17c     cys2    serine O-acetyltransferase/serine O-succinyltransferase Cys2    Q10341
PTHR42912       SPAC1B3.06c             UbiE family methyltransferase   O13871
PTHR23327       SPAC6B12.07c    pqr1    ubiquitin-protein ligase E3, phosphate quantity regulator Pqr1/Spx1     O14212
PTHR24324       SPAC32A11.03c   phx1    DNA-binding transcription factor, stationary phase-specific Phx1        Q10328
PTHR16301       SPBC14C8.09c    dbl3    translation inhibitor, IMPACT domain protein    O60090
PTHR23065       SPAC1952.16     rga9    RhoGAP Rga9     Q9UUJ3
PTHR47219       SPAC4G8.04              RabGAP  Q09830
PTHR43364       SPAC750.01              NADP-dependent aldo/keto reductase, unknown biological role, implicated in cellular detoxification      G2TRN6
PTHR47640       SPBC1289.12     usp109  U1 snRNP-associated protein Usp109      O94621
PTHR31121       SPAC959.04c     omh6    alpha-1,2-mannosyltransferase Omh6      Q9P4X2
PTHR31616       SPAC4H3.03c             glucan 1,4-alpha-glucosidase    Q10211
PTHR10000       SPBC215.10      odr1    HAD superfamily hydrolase, unknown role O94314
PTHR10000       SPAC25B8.12c            HAD superfamily hydrolase, unknown role Q9UTA6
PTHR11073       SPBP4H10.19c            calreticulin/calnexin homolog   Q9P7D0
PTHR47107       SPBC36B7.02     svf2    ER to Golgi ceramide transport protein, lipocalin superfamily Svf2      Q9HGN8
PTHR24012       SPAC1610.03c    crp79   poly(A) binding protein Crp79   Q9P6M8
PTHR24012       SPAC343.07      mug28   RNA-binding protein Mug28, implicated in mRNA processing        Q9UT83
PTHR46910       SPBC530.05      prt1    DNA-binding transcription factor Prt1   O59741
PTHR46910       SPBC530.08      ntu2    DNA-binding transcription factor, membrane-tethered     O59744
PTHR46910       SPBC1773.12             DNA-binding transcription factor, zf-fungal binuclear cluster type      O94569
PTHR46910       SPBC1773.16c            DNA-binding transcription factor, zf-fungal binuclear cluster type      O94573
PTHR46910       SPAC11D3.07c    toe4    DNA-binding transcription factor, zf-fungal binuclear cluster type      Q10086
PTHR46910       SPAPB24D3.01    toe3    DNA-binding transcription factor        Q9C0Z1
PTHR24031       SPBC25D12.06    mrh5    mitochondrial ATP-dependent RNA helicase Mrh5   O74356
PTHR11188       SPBC557.05              arrestin, meiosis specific      Q9USS1
PTHR31145       SPCC1322.03     trp1322 plasma membrane TRP-like calcium ion channel Trp1322    O94543
PTHR10177       SPBC2G2.09c     crs1    meiosis specific cyclin Crs1    O43008
PTHR10177       SPBC19F5.01c    puc1    G1 cyclin Puc1  P25009
PTHR14089       SPCC74.09       mug24   RNA-binding protein, rrm type   O13674
PTHR10357       SPAC27E2.01             alpha-amylase homolog   O13996
PTHR10357       SPAC25H1.09     mde5    alpha-amylase homolog Mde5      O14154
PTHR10357       SPBC16A3.13     meu7    alpha-amylase homolog Aah4      O42918
PTHR10357       SPCC757.12      aah1    cell wall alpha-amylase homolog Aah1    O74922
PTHR10357       SPAC23D3.14c    aah2    alpha-amylase homolog Aah2      Q09840
PTHR10357       SPCC11E10.09c           alpha-amylase homolog   Q10427
PTHR10357       SPCC63.02c      aah3    cell wall alpha-amylase homolog Aah3    Q9Y7S9
PTHR40626       SPAC11D3.17             DNA-binding transcription factor, zf-fungal binuclear cluster type      Q10096
PTHR31001       SPAC1F7.11c             DNA-binding transcription factor, zf-fungal binuclear cluster type      Q09922
PTHR31001       SPAC139.03      toe2    DNA-binding transcription factor, zf-fungal binuclear cluster type Toe2 Q9UTN0
PTHR18898       SPAC1486.04c    alm1    nucleoporin Alm1        Q9UTK5
PTHR12992       SPAC14C4.10c            Nudix family hydrolase  O13717
PTHR31962       SPBC1347.03     meu14   sporulation specific PIL domain protein Meu14   O94756
PTHR10270       SPBC1711.02     mat3-Mc mating type M-specific HMG-box DNA-binding transcription factor Mc at silenced MAT3 locus       P0CY16
PTHR10270       SPBC23G7.09     mat1-Mc M-specific trancription factor Mc       P0CY17
PTHR43991       SPBC2A9.03              WD40/YVTN repeat-like protein   Q9Y7K5
PTHR10071       SPCC1393.08     fil1    DNA-binding transcription factor, zf-GATA-type, amino acid sensing, Fil1        O94720
PTHR43976       SPCC162.03              short chain dehydrogenase       O74628
PTHR43976       SPBC1348.09             short chain dehydrogenase, implicated in cellular detoxification        P0CU01
PTHR19304       SPBC29B5.01     atf1    DNA-binding transcription factor, Atf-CREB family Atf1  P52890
PTHR19304       SPBC2F12.09c    atf21   DNA-binding transcription factor, Atf-CREB family Atf21 P78962
PTHR19304       SPAC22F3.02     atf31   DNA-binding transcription factor Atf31  Q09771
PTHR19304       SPAC21E11.03c   pcr1    DNA-binding transcription factor Pcr1   Q09926

ValWood commented 2 weeks ago

Perfect, thanks. I will work through these.