matthiaskoenig / brendapy

BRENDA parser in python
GNU Lesser General Public License v3.0

Resolve uniprot/swissprot information when provided #28

Closed matthiaskoenig closed 5 years ago

matthiaskoenig commented 5 years ago

Some entries provide UniProt identifiers. These could be resolved from the entry string using a regular expression (open question: how to match optional groups?).
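On the optional-groups question: in Python's `re` module, a group followed by `?` may be absent, and an absent named group is reported as `None`. The following is a minimal sketch, not brendapy's implementation. The name `split_entry` and the simplified accession token (one uppercase letter followed by five uppercase alphanumerics) are assumptions made for illustration.

```python
import re

# Hypothetical pattern: organism text, then an optional accession-like token
# and an optional database label. Each "(?: ... )?" makes its group optional.
PATTERN = re.compile(
    r"^(?P<organism>.+?)"                         # lazy: stop as early as possible
    r"(?:\s+(?P<accession>[A-Z][A-Z0-9]{5}))?"    # simplified accession (assumption)
    r"(?:\s+(?P<source>UniProt|SwissProt))?$"     # optional database label
)


def split_entry(text):
    """Return the named groups as a dict, or None if nothing matches."""
    match = PATTERN.match(text.strip())
    return match.groupdict() if match else None
```

Because the organism group is lazy and the optional groups are tried before the end anchor, an entry such as `Candida maris` still matches, with `accession` and `source` left as `None`.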

matthiaskoenig commented 5 years ago

Many proteins have UniProt/SwissProt identifiers in the PR entries, e.g. for EC 1.1.1.1:

...
PR  #109# Homo sapiens P08319 UniProt <214>
PR  #110# Parageobacillus thermoglucosidasius Q6RS93 UniProt <210>
PR  #111# Thermoplasma acidophilum Q9HIM3 UniProt <213>
PR  #112# Thermus sp. B2ZRE3 UniProt <197>
PR  #113# Saccharomyces pastorianus B6UQD0 UniProt <202>
PR  #114# Geobacillus thermodenitrificans A4IP64 UniProt <215>
PR  #115# Geobacillus thermodenitrificans A4ISB9 UniProt <215>
PR  #116# Mus musculus Q9QYY9  <110>
PR  #117# Saccharomyces cerevisiae P00330  <193,205,209>
PR  #118# Equus caballus P00327  <111,175,205>
PR  #119# Geobacillus stearothermophilus P42328 
    <112,176,246,256,257,258,260>
PR  #119# Geobacillus stearothermophilus P42328 UniProt
    <112,176,246,256,257,258,260>
PR  #120# Saccharomyces cerevisiae P00331  <170>
PR  #121# Euglena gracilis B8QU18  <211>
PR  #122# Sulfurisphaera tokodaii F9VMI9 SwissProt <217>
PR  #123# Sulfolobus acidocaldarius Q4J702 UniProt <219>
PR  #124# Sulfolobus acidocaldarius Q4J9F2 UniProt <218,219>
PR  #125# Glycine max Q9ZT38 UniProt <233>
PR  #126# Bombyx mori Q1G151 UniProt <231>
PR  #127# Ogataea angusta H9ZGN0 UniProt <222>
PR  #128# Candida maris   <225>
PR  #129# Pyrococcus furiosus Q8U259 SwissProt <230>
...
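The excerpt shows three things a parser has to handle: the accession and the database label are each optional, the reference list can wrap onto an indented continuation line, and some entries have neither accession nor label. A rough sketch of one way to handle this is shown here, assuming that continuation lines always start with whitespace and that the label is either `UniProt` or `SwissProt`. The helper names `join_continuations` and `parse_pr` are hypothetical and not part of brendapy. The accession pattern is the one UniProt documents for its 6- and 10-character accessions.

```python
import re

# Accession format as documented by UniProt (6 or 10 characters).
ACCESSION = r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"

PR_LINE = re.compile(
    r"^PR\s+#(?P<protein_id>\d+)#\s+"
    r"(?P<organism>.+?)"                            # lazy, so trailing fields are not swallowed
    rf"(?:\s+(?P<accession>{ACCESSION}))?"          # optional accession
    r"(?:\s+(?P<source>UniProt|SwissProt))?"        # optional database label
    r"(?:\s+<(?P<references>[\d,\s]+)>)?\s*$"       # optional reference list
)


def join_continuations(lines):
    """Merge indented continuation lines into the preceding entry."""
    entries = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue  # skip blank lines and elision markers
        if line[0].isspace() and entries:
            entries[-1] = entries[-1].rstrip() + " " + stripped
        else:
            entries.append(line.rstrip())
    return entries


def parse_pr(entry):
    """Parse one joined PR entry into a dict; return None if it does not match."""
    match = PR_LINE.match(entry)
    if match is None:
        return None
    fields = match.groupdict()
    fields["protein_id"] = int(fields["protein_id"])
    refs = fields["references"] or ""
    fields["references"] = [int(r) for r in re.findall(r"\d+", refs)]
    return fields
```

Usage: `[parse_pr(e) for e in join_continuations(text.splitlines())]` yields one dict per PR entry, with `accession` and `source` set to `None` where the entry does not provide them.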