Open ValWood opened 2 weeks ago
I have a large list of >277 genes that I still need to check for paralogs
Could you link to the list?
oops!
now down to 265 https://www.pombase.org/results/from/id/311eb4b3-506f-46b6-a460-07384637996a
Do you know of a pombe gene ID to Panther ID mapping file?
From the InterPro file. I was accessing them from the domain section.
Thanks.
If a panther family has 2 members, they would be paralogs
Do you mean 2 members that are in your list of 265?
Yep
Although I'm interested if there are members outside of the 265 (but PANTHER sometimes misassigns, although so do I!)
These are the Panther IDs with 2 or more genes in your list. I hope this is what you had in mind:
PTHR24012 SPAC1610.03c crp79 poly(A) binding protein Crp79 Q9P6M8
PTHR24012 SPAC343.07 mug28 RNA-binding protein Mug28, implicated in mRNA processing Q9UT83
PTHR31492 SPBC21D10.06c map4 cell surface adhesion protein for conjugation Map4 O74346
PTHR31492 SPBC1348.08c cell surface glycoprotein, adhesion molecule P0CU04
PTHR31492 SPAC977.07c pfl6 cell surface glycoprotein, flocculin Pfl6 P0CU05
PTHR31492 SPCC188.09c pfl4 cell surface glycoprotein, flocculin Pfl4 Q7Z9I1
PTHR31492 SPAC1F8.06 pfl8 cell surface glycoprotein, flocculin Pfl8 Q92344
PTHR31492 SPAPB2C8.01 cell surface glycoprotein, adhesion molecule Q9C0Y2
PTHR31492 SPAP11E10.02c mam3 cell surface adhesion protein for conjugation Mam3 Q9HDY9
PTHR31001 SPAC1F7.11c DNA-binding transcription factor, zf-fungal binuclear cluster type Q09922
PTHR31001 SPAC139.03 toe2 DNA-binding transcription factor, zf-fungal binuclear cluster type Toe2 Q9UTN0
PTHR19304 SPBC29B5.01 atf1 DNA-binding transcription factor, Atf-CREB family Atf1 P52890
PTHR19304 SPBC2F12.09c atf21 DNA-binding transcription factor, Atf-CREB family Atf21 P78962
PTHR19304 SPAC22F3.02 atf31 DNA-binding transcription factor Atf31 Q09771
PTHR19304 SPAC21E11.03c pcr1 DNA-binding transcription factor Pcr1 Q09926
PTHR10177 SPBC2G2.09c crs1 meiosis specific cyclin Crs1 O43008
PTHR10177 SPBC19F5.01c puc1 G1 cyclin Puc1 P25009
PTHR46910 SPBC530.05 prt1 DNA-binding transcription factor Prt1 O59741
PTHR46910 SPBC530.08 ntu2 DNA-binding transcription factor, membrane-tethered O59744
PTHR46910 SPBC1773.12 DNA-binding transcription factor, zf-fungal binuclear cluster type O94569
PTHR46910 SPBC1773.16c DNA-binding transcription factor, zf-fungal binuclear cluster type O94573
PTHR46910 SPAC11D3.07c toe4 DNA-binding transcription factor, zf-fungal binuclear cluster type Q10086
PTHR46910 SPAPB24D3.01 toe3 DNA-binding transcription factor Q9C0Z1
PTHR10000 SPBC215.10 odr1 HAD superfamily hydrolase, unknown role O94314
PTHR10000 SPAC25B8.12c HAD superfamily hydrolase, unknown role Q9UTA6
PTHR18884 SPBC19F8.01c spn7 meiotic septin Spn7 O60165
PTHR18884 SPAC24C9.15c spn5 meiotic septin Spn5 P48010
PTHR31162 SPCC794.06 transmembrane transporter O59815
PTHR31162 SPAPB8E5.03 mae1 plasma membrane malate/succinate:proton symporter Mae1 P50537
PTHR36206 SPBC15D4.02 gsf1 DNA-binding transcription factor, zf-fungal binuclear cluster type Gsf1 O74308
PTHR36206 SPAC821.07c moc3 DNA-binding transcription factor Moc3 Q9UT46
PTHR10270 SPBC1711.02 mat3-Mc mating type M-specific HMG-box DNA-binding transcription factor Mc at silenced MAT3 locus P0CY16
PTHR10270 SPBC23G7.09 mat1-Mc M-specific transcription factor Mc P0CY17
PTHR43008 SPAC22A12.17c short chain dehydrogenase O13908
PTHR43008 SPCC1739.08c short chain dehydrogenase O74470
PTHR23502 SPAC977.04 truncated C terminal region of membrane transporter G2TRN8
PTHR23502 SPBC609.04 caf5 plasma membrane spermine family transmembrane transporter Caf5 O94528
PTHR43625 SPBC215.11c aldo/keto reductase, unknown biological role O94315
PTHR43625 SPAC1F7.12 yak3 aldose reductase ARK13 family YakC, implicated in cellular detoxification from family members Q09923
PTHR47338 SPBC530.11c DNA-binding transcription factor, zf-fungal binuclear cluster type O59746
PTHR47338 SPAC1327.01c DNA-binding transcription factor, zf-fungal binuclear cluster type Q1MTM9
PTHR47338 SPAPB1A11.04c mca1 DNA-binding transcription factor, zf-fungal binuclear cluster type, meiosis-specific copper activator, Mca1 Q9HDX1
DONE
~PTHR31323 SPAC2C4.17c msy2 MS ion channel protein 2 O14050~
~PTHR31323 SPCC1183.11 msy1 MS calcium ion channel protein Msy1 O74839~
~PTHR43976 SPCC162.03 short chain dehydrogenase O74628~
~PTHR43976 SPBC1348.09 short chain dehydrogenase, implicated in cellular detoxification P0CU01~
~PTHR10357 SPAC27E2.01 alpha-amylase homolog O13996~
~PTHR10357 SPAC25H1.09 mde5 alpha-amylase homolog Mde5 O14154~
~PTHR10357 SPBC16A3.13 meu7 alpha-amylase homolog Aah4 O42918~
~PTHR10357 SPCC757.12 aah1 cell wall alpha-amylase homolog Aah1 O74922~
~PTHR10357 SPAC23D3.14c aah2 alpha-amylase homolog Aah2 Q09840~
~PTHR10357 SPCC11E10.09c alpha-amylase homolog Q10427~
~PTHR10357 SPCC63.02c aah3 cell wall alpha-amylase homolog Aah3 Q9Y7S9~
~PTHR47990 SPAC25B8.13c isp7 2-OG-Fe(II) oxygenase superfamily protein P40902~
~PTHR47990 SPCC1494.01 iron/ascorbate oxidoreductase family Q7LL04~
~PTHR31560 SPAC22H10.08 DUF2009 family protein, conserved in yeast and apicomplexa Q10301~
~PTHR31560 SPCC16A11.03c DUF2009 family protein, conserved in yeast and apicomplexa Q9USN2~
Although I'm interested if there are members outside of the 265 (but PANTHER sometimes misassigns, although so do I!)
This is the list of Panther IDs and genes from your list where the Panther ID matches more than 1 gene:
PTHR43008 SPAC22A12.17c short chain dehydrogenase O13908
PTHR43008 SPCC1739.08c short chain dehydrogenase O74470
PTHR31492 SPBC21D10.06c map4 cell surface adhesion protein for conjugation Map4 O74346
PTHR31492 SPBC1348.08c cell surface glycoprotein, adhesion molecule P0CU04
PTHR31492 SPAC977.07c pfl6 cell surface glycoprotein, flocculin Pfl6 P0CU05
PTHR31492 SPCC188.09c pfl4 cell surface glycoprotein, flocculin Pfl4 Q7Z9I1
PTHR31492 SPAC1F8.06 pfl8 cell surface glycoprotein, flocculin Pfl8 Q92344
PTHR31492 SPAPB2C8.01 cell surface glycoprotein, adhesion molecule Q9C0Y2
PTHR31492 SPAP11E10.02c mam3 cell surface adhesion protein for conjugation Mam3 Q9HDY9
PTHR23502 SPAC977.04 truncated C terminal region of membrane transporter G2TRN8
PTHR23502 SPBC609.04 caf5 plasma membrane spermine family transmembrane transporter Caf5 O94528
PTHR23092 SPCC663.12 cid12 poly(A) polymerase Cid12 O74518
PTHR12271 SPAC821.04c cid13 cytoplasmic poly(A) polymerase Cid13 Q9UT49
PTHR47990 SPAC25B8.13c isp7 2-OG-Fe(II) oxygenase superfamily protein P40902
PTHR47990 SPCC1494.01 iron/ascorbate oxidoreductase family Q7LL04
PTHR43173 SPAC14C4.09 agn1 cell wall glucan endo-1,3-alpha-glucosidase Agn1 O13716
PTHR43249 SPBC12C2.04 NAD binding dehydrogenase family protein Q09745
PTHR11986 SPAC27F1.05c aminotransferase class-III, unknown specificity Q10174
PTHR48012 SPAC12B10.14c tea5 pseudokinase Tea5 Q10447
PTHR13710 SPAC212.06c DNA helicase in rearranged telomeric region, truncated G2TRN7
PTHR47263 SPBC21C3.20c git1 C2 domain protein Git1 Q9P7K5
PTHR46517 SPCC1620.13 phosphoglycerate mutase/6-phosphofructo-2-kinase family O94420
PTHR31323 SPAC2C4.17c msy2 MS ion channel protein 2 O14050
PTHR31323 SPCC1183.11 msy1 MS calcium ion channel protein Msy1 O74839
PTHR43625 SPBC215.11c aldo/keto reductase, unknown biological role O94315
PTHR43625 SPAC1F7.12 yak3 aldose reductase ARK13 family YakC, implicated in cellular detoxification from family members Q09923
PTHR18884 SPBC19F8.01c spn7 meiotic septin Spn7 O60165
PTHR18884 SPAC24C9.15c spn5 meiotic septin Spn5 P48010
PTHR10644 SPCC11E10.08 rik1 CLRC ubiquitin ligase complex WD repeat subunit Rik1 Q10426
PTHR47338 SPBC530.11c DNA-binding transcription factor, zf-fungal binuclear cluster type O59746
PTHR47338 SPAC1327.01c DNA-binding transcription factor, zf-fungal binuclear cluster type Q1MTM9
PTHR47338 SPAPB1A11.04c mca1 DNA-binding transcription factor, zf-fungal binuclear cluster type, meiosis-specific copper activator, Mca1 Q9HDX1
PTHR14430 SPCC1183.12 spo13 sporulation specific RabGEF Spo13 C6Y4C9
PTHR12304 SPBC800.11 inosine-uridine preferring nucleoside hydrolase Q9HGL1
PTHR24346 SPAC23H4.02 ppk9 serine/threonine protein kinase Ppk9 O13945
PTHR48100 SPBPB21E7.02c phosphoglycerate mutase/6-phosphofructo-2-kinase family U3H041
PTHR43731 SPBP4H10.10 rbd3 mitochondrial rhomboid family peptidase Rbd3 Q9P7D8
PTHR31162 SPCC794.06 transmembrane transporter O59815
PTHR31162 SPAPB8E5.03 mae1 plasma membrane malate/succinate:proton symporter Mae1 P50537
PTHR45808 SPBC354.13 rga6 RhoGAP for Cdc42, Rga6 O43027
PTHR42791 SPAC56E4.07 N-acetyltransferase O14195
PTHR11802 SPAC1296.03c sxa2 serine carboxypeptidase Sxa2 P32825
PTHR13355 SPAC11D3.02c ELLA family acetyltransferase Q10081
PTHR31560 SPAC22H10.08 DUF2009 family protein, conserved in yeast and apicomplexa Q10301
PTHR31560 SPCC16A11.03c DUF2009 family protein, conserved in yeast and apicomplexa Q9USN2
PTHR36206 SPBC15D4.02 gsf1 DNA-binding transcription factor, zf-fungal binuclear cluster type Gsf1 O74308
PTHR36206 SPAC821.07c moc3 DNA-binding transcription factor Moc3 Q9UT46
PTHR43172 SPBC8E4.05c fumarate lyase superfamily, unknown specificity, bacterial 3-carboxy-cis,cis-muconate cycloisomerase related O42889
PTHR32268 SPBC106.17c cys2 serine O-acetyltransferase/serine O-succinyltransferase Cys2 Q10341
PTHR42912 SPAC1B3.06c UbiE family methyltransferase O13871
PTHR23327 SPAC6B12.07c pqr1 ubiquitin-protein ligase E3, phosphate quantity regulator Pqr1/Spx1 O14212
PTHR24324 SPAC32A11.03c phx1 DNA-binding transcription factor, stationary phase-specific Phx1 Q10328
PTHR16301 SPBC14C8.09c dbl3 translation inhibitor, IMPACT domain protein O60090
PTHR23065 SPAC1952.16 rga9 RhoGAP Rga9 Q9UUJ3
PTHR47219 SPAC4G8.04 RabGAP Q09830
PTHR43364 SPAC750.01 NADP-dependent aldo/keto reductase, unknown biological role, implicated in cellular detoxification G2TRN6
PTHR47640 SPBC1289.12 usp109 U1 snRNP-associated protein Usp109 O94621
PTHR31121 SPAC959.04c omh6 alpha-1,2-mannosyltransferase Omh6 Q9P4X2
PTHR31616 SPAC4H3.03c glucan 1,4-alpha-glucosidase Q10211
PTHR10000 SPBC215.10 odr1 HAD superfamily hydrolase, unknown role O94314
PTHR10000 SPAC25B8.12c HAD superfamily hydrolase, unknown role Q9UTA6
PTHR11073 SPBP4H10.19c calreticulin/calnexin homolog Q9P7D0
PTHR47107 SPBC36B7.02 svf2 ER to Golgi ceramide transport protein, lipocalin superfamily Svf2 Q9HGN8
PTHR24012 SPAC1610.03c crp79 poly(A) binding protein Crp79 Q9P6M8
PTHR24012 SPAC343.07 mug28 RNA-binding protein Mug28, implicated in mRNA processing Q9UT83
PTHR46910 SPBC530.05 prt1 DNA-binding transcription factor Prt1 O59741
PTHR46910 SPBC530.08 ntu2 DNA-binding transcription factor, membrane-tethered O59744
PTHR46910 SPBC1773.12 DNA-binding transcription factor, zf-fungal binuclear cluster type O94569
PTHR46910 SPBC1773.16c DNA-binding transcription factor, zf-fungal binuclear cluster type O94573
PTHR46910 SPAC11D3.07c toe4 DNA-binding transcription factor, zf-fungal binuclear cluster type Q10086
PTHR46910 SPAPB24D3.01 toe3 DNA-binding transcription factor Q9C0Z1
PTHR24031 SPBC25D12.06 mrh5 mitochondrial ATP-dependent RNA helicase Mrh5 O74356
PTHR11188 SPBC557.05 arrestin, meiosis specific Q9USS1
PTHR31145 SPCC1322.03 trp1322 plasma membrane TRP-like calcium ion channel Trp1322 O94543
PTHR10177 SPBC2G2.09c crs1 meiosis specific cyclin Crs1 O43008
PTHR10177 SPBC19F5.01c puc1 G1 cyclin Puc1 P25009
PTHR14089 SPCC74.09 mug24 RNA-binding protein, rrm type O13674
PTHR10357 SPAC27E2.01 alpha-amylase homolog O13996
PTHR10357 SPAC25H1.09 mde5 alpha-amylase homolog Mde5 O14154
PTHR10357 SPBC16A3.13 meu7 alpha-amylase homolog Aah4 O42918
PTHR10357 SPCC757.12 aah1 cell wall alpha-amylase homolog Aah1 O74922
PTHR10357 SPAC23D3.14c aah2 alpha-amylase homolog Aah2 Q09840
PTHR10357 SPCC11E10.09c alpha-amylase homolog Q10427
PTHR10357 SPCC63.02c aah3 cell wall alpha-amylase homolog Aah3 Q9Y7S9
PTHR40626 SPAC11D3.17 DNA-binding transcription factor, zf-fungal binuclear cluster type Q10096
PTHR31001 SPAC1F7.11c DNA-binding transcription factor, zf-fungal binuclear cluster type Q09922
PTHR31001 SPAC139.03 toe2 DNA-binding transcription factor, zf-fungal binuclear cluster type Toe2 Q9UTN0
PTHR18898 SPAC1486.04c alm1 nucleoporin Alm1 Q9UTK5
PTHR12992 SPAC14C4.10c Nudix family hydrolase O13717
PTHR31962 SPBC1347.03 meu14 sporulation specific PIL domain protein Meu14 O94756
PTHR10270 SPBC1711.02 mat3-Mc mating type M-specific HMG-box DNA-binding transcription factor Mc at silenced MAT3 locus P0CY16
PTHR10270 SPBC23G7.09 mat1-Mc M-specific transcription factor Mc P0CY17
PTHR43991 SPBC2A9.03 WD40/YVTN repeat-like protein Q9Y7K5
PTHR10071 SPCC1393.08 fil1 DNA-binding transcription factor, zf-GATA-type, amino acid sensing, Fil1 O94720
PTHR43976 SPCC162.03 short chain dehydrogenase O74628
PTHR43976 SPBC1348.09 short chain dehydrogenase, implicated in cellular detoxification P0CU01
PTHR19304 SPBC29B5.01 atf1 DNA-binding transcription factor, Atf-CREB family Atf1 P52890
PTHR19304 SPBC2F12.09c atf21 DNA-binding transcription factor, Atf-CREB family Atf21 P78962
PTHR19304 SPAC22F3.02 atf31 DNA-binding transcription factor Atf31 Q09771
PTHR19304 SPAC21E11.03c pcr1 DNA-binding transcription factor Pcr1 Q09926
Perfect, thanks. I will work through these.
I have a large list of >277 genes that I still need to check for paralogs
This is quite time-consuming, but I have a heuristic!
Could you extract the PANTHER families for these? If a PANTHER family has 2 members, they would be paralogs. If a PANTHER family has 1 member, it can be removed from this list
(that should sort most of them)
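For anyone wanting to repeat the heuristic, here is a minimal Python sketch of it. It groups the genes in the list by PANTHER family and separates families with 2 or more listed members (candidate paralogs) from families with a single listed member (candidates for removal). The function name `families_with_paralogs` and the dict-based mapping are assumptions for illustration only. The thread does not specify the layout of the InterPro-derived mapping file, so loading it is left out, and the handful of rows below are copied from the lists above.

```python
from collections import defaultdict


def families_with_paralogs(gene_to_family, gene_list, min_members=2):
    """Group listed genes by PANTHER family and split the families by size.

    gene_to_family: dict of systematic gene ID -> PANTHER family ID
                    (hypothetical structure; build it from your mapping file).
    gene_list:      the genes still to be checked.
    Returns (multi, single): families with >= min_members listed genes,
    and families with fewer.
    """
    by_family = defaultdict(list)
    for gene in gene_list:
        family = gene_to_family.get(gene)
        if family is not None:  # genes without a PANTHER assignment are skipped
            by_family[family].append(gene)

    multi = {f: sorted(g) for f, g in by_family.items() if len(g) >= min_members}
    single = {f: sorted(g) for f, g in by_family.items() if len(g) < min_members}
    return multi, single


# A few rows copied from the lists in this thread
gene_to_family = {
    "SPAC1610.03c": "PTHR24012",
    "SPAC343.07": "PTHR24012",
    "SPBC2G2.09c": "PTHR10177",
    "SPBC19F5.01c": "PTHR10177",
    "SPCC663.12": "PTHR23092",
}
gene_list = list(gene_to_family)

multi, single = families_with_paralogs(gene_to_family, gene_list)
for family, genes in sorted(multi.items()):
    print(family, ", ".join(genes))
```

Note that this only counts members inside the gene list. To see members outside the 265, the same grouping could be run over the whole-genome mapping instead, bearing in mind that PANTHER assignments are sometimes wrong.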