SBRG / bigg_models

The BiGG Models website server
http://bigg.ucsd.edu

Different reaction IDs for identical reactions #280

Open willigott opened 6 years ago

willigott commented 6 years ago

Description of the issue

Reactions that are identical according to their annotation have different IDs. Some of them might differ only in proton stoichiometry, which implies that one of the two reactions is unbalanced (e.g. FMNAT and AFAT).
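The proton case can be checked mechanically. The following is a minimal sketch, not the check used in this issue: it assumes reaction strings of the form `a_c + 2 b_c <-> c_c`, assumes proton IDs look like `h_c`, and uses illustrative strings that are not copied from the BiGG entries. The function names are hypothetical.

```python
import re

PROTON = re.compile(r'^h_[a-z]+$')            # assumed proton ID pattern, e.g. h_c
ARROW = re.compile(r'\s*(?:<->|-->|<--)\s*')  # assumed arrow notations


def net_stoichiometry(rea_string):
    """Return {metabolite: net coefficient}; substrates negative, products positive."""
    left, right = ARROW.split(rea_string.strip(), maxsplit=1)
    net = {}
    for side, sign in ((left, -1.0), (right, 1.0)):
        for term in side.split(' + '):
            parts = term.split()
            if not parts:
                continue  # empty side, e.g. in exchange reactions
            # an optional leading number is the coefficient; the default is 1
            coeff = float(parts[0]) if len(parts) == 2 else 1.0
            net[parts[-1]] = net.get(parts[-1], 0.0) + sign * coeff
    return {met: c for met, c in net.items() if c != 0.0}


def without_protons(net):
    """Drop all proton entries from a net stoichiometry."""
    return {met: c for met, c in net.items() if not PROTON.match(met)}


def differs_only_in_protons(rea_a, rea_b):
    """True if the reactions differ, but become equal once protons are ignored."""
    net_a, net_b = net_stoichiometry(rea_a), net_stoichiometry(rea_b)
    return net_a != net_b and without_protons(net_a) == without_protons(net_b)


# illustrative strings only
with_h = 'atp_c + fmn_c + h_c <-> fad_c + ppi_c'
without_h = 'atp_c + fmn_c <-> fad_c + ppi_c'
print(differs_only_in_protons(with_h, without_h))
```

The sketch does not recognise a reaction written in the opposite direction as the same reaction; that would need an additional comparison against the negated stoichiometry.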

A second case is where the metabolite IDs differ although the compound is the same (see issue #274; e.g. UNK3 and ARAT).

Here are a few more candidates. I only checked whether i) the IDs have the form "an_ID" and "an_ID_1", ii) their MetaNetX IDs are identical, and iii) they occur in the same compartment.

I did not check all of them manually afterwards:

[('3SPYRSPm_1', '3SPYRSPm'),
 ('4HOXPACMOF_1', '4HOXPACMOF'),
 ('4HOXPACt2pp_1', '4HOXPACt2pp'),
 ('ABTt_1', 'ABTt'),
 ('ACODA_1', 'ACODA'),
 ('ACPS1_1', 'ACPS1'),
 ('AKP1_1', 'AKP1'),
 ('ALDD31_1', 'ALDD31'),
 ('ARD_1', 'ARD'),
 ('ARGN_1', 'ARGN'),
 ('ARGSS_1', 'ARGSS'),
 ('ASNTRS_1', 'ASNTRS'),
 ('ASPCT_1', 'ASPCT'),
 ('ASPO3_1', 'ASPO3'),
 ('ASPO4_1', 'ASPO4'),
 ('ASPO5_1', 'ASPO5'),
 ('ASPO6_1', 'ASPO6'),
 ('ASPOcm_1', 'ASPOcm'),
 ('DHBSH_1', 'DHBSH'),
 ('DHCINDO_1', 'DHCINDO'),
 ('DHFS_1', 'DHFS'),
 ('DHNPA_1', 'DHNPA'),
 ('DHPS2_1', 'DHPS2'),
 ('DNTPPA_1', 'DNTPPA'),
 ('ECOAH1_1', 'ECOAH1'),
 ('ECOAH5_1', 'ECOAH5'),
 ('ENTCS_1', 'ENTCS'),
 ('FGLU_1', 'FGLU'),
 ('FLVR_1', 'FLVR'),
 ('FMETTRS_1', 'FMETTRS'),
 ('FOLD3_1', 'FOLD3'),
 ('FOLR2_1', 'FOLR2'),
 ('FTHFI_1', 'FTHFI'),
 ('G6PDA_1', 'G6PDA'),
 ('GALCTD_1', 'GALCTD'),
 ('GF6PTA_1', 'GF6PTA'),
 ('GLCRAL_1', 'GLCRAL'),
 ('GLCRD_1', 'GLCRD'),
 ('GMPtn_1', 'GMPtn'),
 ('GTPCI_1', 'GTPCI'),
 ('GTPDPK_1', 'GTPDPK'),
 ('GUAD_1', 'GUAD'),
 ('HACD1_1', 'HACD1'),
 ('HEMEAS_1', 'HEMEAS'),
 ('HEMEOS_1', 'HEMEOS'),
 ('HEMEOSm_1', 'HEMEOSm'),
 ('HKNDDH_1', 'HKNDDH'),
 ('HKNTDH_1', 'HKNTDH'),
 ('HP5CD_1', 'HP5CD'),
 ('HPPK_1', 'HPPK'),
 ('HPPPNDO_1', 'HPPPNDO'),
 ('IG3PS_1', 'IG3PS'),
 ('IGPDH_1', 'IGPDH'),
 ('IZPN_1', 'IZPN'),
 ('METS_1', 'METS'),
 ('MTHFC_1', 'MTHFC'),
 ('MTHFCm_1', 'MTHFCm'),
 ('MTHFD_1', 'MTHFD'),
 ('MTHFDm_1', 'MTHFDm'),
 ('MTHFR2_1', 'MTHFR2'),
 ('MTHFR3_1', 'MTHFR3'),
 ('NOS2_1', 'NOS2'),
 ('OCBT_1', 'OCBT'),
 ('ORNTAC_1', 'ORNTAC'),
 ('ORNTA_1', 'ORNTA'),
 ('PFK26_1', 'PFK26'),
 ('PRAIS_1', 'PRAIS'),
 ('PRAMPC_1', 'PRAMPC'),
 ('PRFGS_1', 'PRFGS'),
 ('PYDXDH_1', 'PYDXDH'),
 ('QULNS_1', 'QULNS'),
 ('SEAHCYSHYD_1', 'SEAHCYSHYD'),
 ('SELCYSLY2_1', 'SELCYSLY2'),
 ('SHCHD2_1', 'SHCHD2'),
 ('SULR_1', 'SULR'),
 ('THFGLUS_1', 'THFGLUS'),
 ('URCN_1', 'URCN')]
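Criterion i) on its own can be sketched as follows. This is an assumption about how such a check might look, not the code that produced the list above, and it leaves out the MetaNetX and compartment criteria. The function name `suffix_pairs` is hypothetical.

```python
def suffix_pairs(bigg_ids, suffix='_1'):
    """Return sorted (suffixed_id, base_id) pairs where both IDs are present."""
    ids = set(bigg_ids)
    return sorted(
        (rid, rid[:-len(suffix)])
        for rid in ids
        if rid.endswith(suffix) and rid[:-len(suffix)] in ids
    )


# IDs taken from this issue; GLYCL_2 is ignored because only "_1" is checked
example_ids = ['SULR', 'SULR_1', 'PFK26', 'PFK26_1', 'OCT', 'GLYCL_2']
print(suffix_pairs(example_ids))
# → [('PFK26_1', 'PFK26'), ('SULR_1', 'SULR')]
```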

Page

http://bigg.ucsd.edu/universal/reactions/FMNAT http://bigg.ucsd.edu/universal/reactions/AFAT

http://bigg.ucsd.edu/universal/reactions/UNK3 http://bigg.ucsd.edu/universal/reactions/ARAT

http://bigg.ucsd.edu/universal/reactions/APPAT http://bigg.ucsd.edu/universal/reactions/PTPATi

http://bigg.ucsd.edu/universal/reactions/SULR http://bigg.ucsd.edu/universal/reactions/SULR_1

http://bigg.ucsd.edu/universal/reactions/IMACTD http://bigg.ucsd.edu/universal/reactions/ALDD20x_1

http://bigg.ucsd.edu/universal/reactions/OCBT http://bigg.ucsd.edu/universal/reactions/OCBT_1 http://bigg.ucsd.edu/universal/reactions/OCT

http://bigg.ucsd.edu/universal/reactions/PFK26 http://bigg.ucsd.edu/universal/reactions/PFK26_1

willigott commented 6 years ago

A more complete overview:

Table of all reactions for which another BiGG reaction exists that i) has the same MetaNetX ID and ii) occurs in the same compartment (based on the IDs of the metabolites):

metanetx_id bigg_id
MNXR100010 GALS4
MNXR100010 STACHGALACT
MNXR100054 GBEZ
MNXR100054 GBEZY
MNXR100067 GCCam
MNXR100067 GLYDHD
MNXR100068 GCCbim
MNXR100068 MTAM_nh4
MNXR100078 GCPN
MNXR100078 PDE4
MNXR100085 GDH1
MNXR100085 GLUDxi
MNXR100099 GDTP
MNXR100099 GTPDPDP
MNXR100107 GF6PTA
MNXR100107 GF6PTA_1
MNXR100155 GLBRAN2
MNXR100155 GLDBRAN2
MNXR100209 GLCNt2ir
MNXR100209 GLCNt2r
MNXR100219 GLCRAL
MNXR100219 GLCRAL_1
MNXR100220 GLCRD
MNXR100220 GLCRD_1
MNXR100283 GLUK
MNXR100283 GLUK_syn
MNXR100301 GLUt5m
MNXR100301 GLUt7m
MNXR100303 GLXCBL
MNXR100303 GLXCL
MNXR100330 GLYCL
MNXR100330 GLYCL_2
MNXR100355 GLYOX_1
MNXR100355 LGTHL
MNXR100381 GMP5N
MNXR100381 NTD9
MNXR100386 GMPtn
MNXR100386 GMPtn_1
MNXR100390 GNK
MNXR100390 GNKr
MNXR100409 GPAR
MNXR100409 GUAPRT
MNXR100453 GTPCI
MNXR100453 GTPCI_1
MNXR100453 GTPCI_2
MNXR100457 GTPDPK
MNXR100457 GTPDPK_1
MNXR100463 GUAC
MNXR100463 GUACYC
MNXR100464 GUAD
MNXR100464 GUAD_1
MNXR100482 H2CO3D
MNXR100482 HCO3E
MNXR100482 H2CO3Dm
MNXR100482 HCO3Em
MNXR100541 HACD1
MNXR100541 HACD1_1
MNXR100541 HACD1i
MNXR100543 HACD2
MNXR100543 HACD2i
MNXR100544 HACD3
MNXR100544 HACD3i
MNXR100545 HACD4
MNXR100545 HACD4i
MNXR100546 HACD5
MNXR100546 HACD5i
MNXR100547 HACD6
MNXR100547 HACD6i
MNXR100548 HACD7
MNXR100548 HACD7i
MNXR100549 HACD8
MNXR100549 HACD8i
MNXR100581 HDC
MNXR100581 HISDC
MNXR100587 HDH
MNXR100587 HISTD
MNXR100591 HEMEAS
MNXR100591 HEMEAS_1
MNXR100596 HEMEOS
MNXR100596 HEMEOS_1
MNXR100596 HEMEOSm
MNXR100596 HEMEOSm_1
MNXR100642 HISTP
MNXR100642 HP
MNXR100654 HKNDDH
MNXR100654 HKNDDH_1
MNXR100655 HKNTDH
MNXR100655 HKNTDH_1
MNXR100657 HKt
MNXR100657 Kabc
MNXR100657 HKtpp
MNXR100657 Kabcpp
MNXR100660 HMGCOAS
MNXR100660 MHGS
MNXR100662 HMGL_1
MNXR100662 HMGLm
MNXR100682 HOXPRx
MNXR100682 TRSARr
MNXR100683 HP5CD
MNXR100683 HP5CD_1
MNXR100692 HPI
MNXR100692 HPYRI
MNXR100693 HPPK
MNXR100693 HPPK_1
MNXR100695 HPPPNDO
MNXR100695 HPPPNDO_1
MNXR100698 HPROa
MNXR100698 HPROx
MNXR100764 HYPTROX
MNXR100764 HYPTROX_cho
MNXR100808 IDPm
MNXR100808 PPAm
MNXR100811 IG3PS
MNXR100811 IG3PS_1
MNXR100813 IGPDH
MNXR100813 IGPDH_1
MNXR100835 INDOLEt2pp
MNXR100835 INDOLEt2rpp
MNXR100877 IPKK
MNXR100877 MI13456PK
MNXR100877 PMI1346PS
MNXR100887 ITK1K
MNXR100887 MI3456PK
MNXR100896 IZPN
MNXR100896 IZPN_1
MNXR101007 LALDD
MNXR101007 LALDO2
MNXR101032 LCYSTCBOXL
MNXR101032 SALADC2
MNXR101081 LIPATPT
MNXR101081 RE1944C
MNXR101095 LLDH_ferr_m
MNXR101095 L_LACDm
MNXR101105 LNLCCPT1
MNXR101105 LNLCCPT2rbc
MNXR101345 MALMDA
MNXR101345 PYRZAM
MNXR101382 MAN6PI
MNXR101382 MPAKI
MNXR101411 MCCCrm
MNXR101411 MCTC
MNXR101412 MCD
MNXR101412 MCDC
MNXR101439 MDH
MNXR101439 MDHi2
MNXR101479 METOX1s
MNXR101479 METOX2s
MNXR101479 METSR_S2
MNXR101481 METS
MNXR101481 METS_1
MNXR101482 METSOX1abcpp
MNXR101482 METSOX2abcpp
MNXR101483 METSOX1tex
MNXR101483 METSOX2tex
MNXR101484 METSOXR1
MNXR101484 METSOXR2
MNXR101484 METSR_S1
MNXR101556 MI123456PP
MNXR101556 PHYT3
MNXR101556 PMI1346PH
MNXR101585 MINOHPtn
MNXR101585 PPMI1346Ptn
MNXR101748 MTHFC
MNXR101748 MTHFC_1
MNXR101748 MTHFCm
MNXR101748 MTHFCm_1
MNXR101749 MTHFD
MNXR101749 MTHFD_1
MNXR101749 MTHFDm
MNXR101749 MTHFDm_1
MNXR101750 MTHFD2
MNXR101750 MTHFD2i
MNXR101751 MTHFR2
MNXR101751 MTHFR2_1
MNXR101752 MTHFR3
MNXR101752 MTHFR3_1
MNXR101861 NADDP
MNXR101861 NPH
MNXR101881 NADHtpu
MNXR101881 NADtpu
MNXR101911 NAPT
MNXR101911 NMNS
MNXR101950 NH4t
MNXR101950 NH4ti
MNXR102007 NOS2
MNXR102007 NOS2_1
MNXR102108 OALT
MNXR102108 UDCPAT
MNXR102110 OAO4t3pp
MNXR102110 udcdpgalrmnmanabetpp
MNXR102137 OCBT
MNXR102137 OCBT_1
MNXR102137 OCT
MNXR102137 OCBTm
MNXR102137 OCTm
MNXR102158 OGLT
MNXR102158 UDCPGT
MNXR102186 OMLT
MNXR102186 UDCPMT
MNXR102216 ORLT
MNXR102216 UDCPRT
MNXR102220 ORNTA
MNXR102220 ORNTA_1
MNXR102221 ORNTAC
MNXR102221 ORNTAC_1
MNXR102224 ORNt2
MNXR102224 ORNt2r
MNXR102302 P5CR
MNXR102302 PRO1y
MNXR102303 P5CRx
MNXR102303 PRO1x
MNXR102303 P5CRxm
MNXR102303 PRO1xm
MNXR102345 PANTS
MNXR102345 PBAL
MNXR102430 PDHe2r
MNXR102430 r0555
MNXR102508 PFK26
MNXR102508 PFK26_1
MNXR102871 PIcm
MNXR102871 PIt5m
MNXR103047 PNCDC
MNXR103047 r0580
MNXR103139 PPRGL
MNXR103139 PRAGSr
MNXR103154 PRAMPC
MNXR103154 PRAMPC_1
MNXR103157 PRAIS
MNXR103157 PRAIS_1
MNXR103157 PRFGCL
MNXR103165 PRFGS
MNXR103165 PRFGS_1
MNXR103310 PSUDS
MNXR103310 YUMPS
MNXR103360 PYDXDH
MNXR103360 PYDXDH_1
MNXR103400 QULNS
MNXR103400 QULNS_1
MNXR103639 RE2248C
MNXR103639 RE2249C
MNXR103746 RE2870C
MNXR103746 RE2871C
MNXR104302 SEAHCYSHYD
MNXR104302 SEAHCYSHYD_1
MNXR104312 SELCYSLY2
MNXR104312 SELCYSLY2_1
MNXR104322 SELNPS
MNXR104322 SELNPS_cho
MNXR104339 SERD_L
MNXR104339 SER_AL
MNXR104367 SGPL13
MNXR104367 r0786
MNXR104373 SHCHD2
MNXR104373 SHCHD2_1
MNXR104375 SHCHF
MNXR104375 SHCHF_2
MNXR104382 SHSL2
MNXR104382 SHSL2r
MNXR104442 SLDx
MNXR104442 SLDxi2
MNXR104503 SPS
MNXR104503 UFAGT
MNXR104635 SUCLm
MNXR104635 SUCOASm
MNXR104649 SULOm
MNXR104649 r0142
MNXR104650 SULR
MNXR104650 SULR_1
MNXR104808 THFGLUS
MNXR104808 THFGLUS_1
MNXR104822 THMMPt4
MNXR104822 THMMPtrbc
MNXR104822 THMMPtm
MNXR104822 THMMPtm_cho
MNXR104824 THMPPt2m
MNXR104824 THMPPtm
MNXR104824 THMPPtm_cho
MNXR105000 TYRTA
MNXR105000 TYRTAi
MNXR105049 UDPDOLPT
MNXR105049 UDPDOLPT_cho
MNXR105128 UNK2
MNXR105128 UNK2_cho
MNXR105137 UPP3MT
MNXR105137 UPP3MT_2
MNXR105149 URCB
MNXR105149 UREASE
MNXR105150 URCN
MNXR105150 URCN_1
MNXR105156 UREAt
MNXR105156 UREAt5
MNXR105203 VITD3t
MNXR105203 VITD3t2
MNXR105203 VITD3tm
MNXR105203 VITD3tm3
MNXR105237 XOLEST2te_cho
MNXR105237 XOLESTte_cho
MNXR105262 XYLTD_D
MNXR105262 r0784
MNXR94731 23DK5MPPISO
MNXR94731 ACRS
MNXR94731 DM1PE
MNXR94737 25HVITD2t
MNXR94737 25HVITD2tin
MNXR94737 25HVITD2tin_m
MNXR94737 25HVITD2tm
MNXR94739 25HVITD3t
MNXR94739 25HVITD3tin
MNXR94739 25HVITD3tin_m
MNXR94739 25HVITD3tm
MNXR94796 2HBO
MNXR94796 LDH2
MNXR94799 2HH24DDH
MNXR94799 2HH24DDH1
MNXR94815 2OH3K5MPPISO
MNXR94815 ENOPH
MNXR94835 34DHPACDO
MNXR94835 DHPDO
MNXR94978 3SALACBOXL
MNXR94978 3SALACBOXL_cho
MNXR94980 3SALATAim
MNXR94980 AATGm
MNXR94981 3SPYRSPm
MNXR94981 3SPYRSPm_1
MNXR94990 44MZYMMO
MNXR94990 C4STMO1
MNXR94999 4CMLCL_kt
MNXR94999 CMLDC
MNXR95000 4H2KPILY
MNXR95000 DHEDAA
MNXR95018 4HOXPACMOF
MNXR95018 4HOXPACMOF_1
MNXR95019 4HOXPACMON
MNXR95019 HPA3MO
MNXR95020 4HOXPACt2pp
MNXR95020 4HOXPACt2pp_1
MNXR95054 4hoxpactex
MNXR95054 HPAtex
MNXR95063 OPETDC
MNXR95063 OPTCCL
MNXR95104 6PHBG
MNXR95104 S6PG
MNXR95114 A1E
MNXR95114 GalMr
MNXR95133 AABHH
MNXR95133 r0466
MNXR95163 AATHB
MNXR95163 TREH
MNXR95190 ABTt
MNXR95190 ABTt_1
MNXR95194 ACACT1r
MNXR95194 KAT1
MNXR95222 ACCOALm
MNXR95222 PPACOALm
MNXR95227 ACDO
MNXR95227 ARD
MNXR95227 ARD_1
MNXR95377 ACODA
MNXR95377 ACODA_1
MNXR95403 ACPS1
MNXR95403 ACPS1_1
MNXR95403 r0368
MNXR95411 ACPpds
MNXR95411 r0366
MNXR95415 ACSERSULL
MNXR95415 SLCYSS
MNXR95429 ACt2ipp
MNXR95429 ACt2rpp
MNXR95501 AFAT
MNXR95501 FMNAT
MNXR95646 AIRCr
MNXR95646 PRAIC
MNXR95656 AKGDHe2r
MNXR95656 AKGDbm
MNXR95656 r0556
MNXR95657 AKGDam
MNXR95657 r0384
MNXR95665 AKP1
MNXR95665 AKP1_1
MNXR95713 ALCD21_D
MNXR95713 LCARR
MNXR95725 ALCD2ir
MNXR95725 ALCD2x
MNXR95728 ALCD4
MNXR95728 BTS
MNXR95729 ALCD4y
MNXR95729 BTS_nadph
MNXR95745 ALDD20x_1
MNXR95745 IMACTD
MNXR95752 ALDD31
MNXR95752 ALDD31_1
MNXR95815 AMID4
MNXR95815 AMID_1
MNXR95817 AMITKPn
MNXR95817 MI1456PKn
MNXR95841 ANNAT
MNXR95841 NMNAT
MNXR95841 ANNATn
MNXR95841 NMNATn
MNXR95850 AO
MNXR95850 HISTASE
MNXR95852 HSTPT
MNXR95852 HSTPTr
MNXR95864 APATi
MNXR95864 THPAT
MNXR95866 APCPT
MNXR95866 r0679
MNXR95892 APPAT
MNXR95892 PTPATi
MNXR95911 ARABR
MNXR95911 ARABRr
MNXR95923 ARAT
MNXR95923 UNK3
MNXR95941 ARGDI
MNXR95941 ARGDr
MNXR95945 ARGN
MNXR95945 ARGN_1
MNXR95949 ARGSS
MNXR95949 ARGSS_1
MNXR95966 ARTCOAL1
MNXR95966 ARTCOAL1_cho
MNXR95967 ARTCOAL2
MNXR95967 ARTCOAL2_cho
MNXR95968 ARTCOAL3
MNXR95968 ARTCOAL3_cho
MNXR96064 ASNTRS
MNXR96064 ASNTRS_1
MNXR96080 ASPCT
MNXR96080 ASPCT_1
MNXR96091 ASPO3
MNXR96091 ASPO3_1
MNXR96092 ASPO4
MNXR96092 ASPO4_1
MNXR96093 ASPO5
MNXR96093 ASPO5_1
MNXR96094 ASPO6
MNXR96094 ASPO6_1
MNXR96095 ASPOcm
MNXR96095 ASPOcm_1
MNXR96118 ATDGDm
MNXR96118 NDPK5m
MNXR96119 ATGDm
MNXR96119 NDPK1m
MNXR96131 ATPM
MNXR96131 NTP1
MNXR96229 LEUTA
MNXR96229 LEUTAi
MNXR96241 BGLA
MNXR96241 CLBH
MNXR96342 BTS3r
MNXR96342 BTS_1
MNXR96347 BUPN
MNXR96347 UPPN
MNXR96384 C160CPT1
MNXR96384 C160CPT2rbc
MNXR96396 C181CPT1
MNXR96396 C181CPT2rbc
MNXR96634 CERASE124er
MNXR96634 CERS124er
MNXR96635 CERASE126er
MNXR96635 CERS126er
MNXR96636 CERASE224er
MNXR96636 CERS224er
MNXR96637 CERASE226er
MNXR96637 CERS226er
MNXR96671 CHITPH
MNXR96671 DC6PH
MNXR96720 CHYA1
MNXR96720 CHYA2
MNXR96757 CKc
MNXR96757 CKc_cho
MNXR96757 CK
MNXR96757 CK_cho
MNXR96840 CODSCL8XI
MNXR96840 CPC8MM
MNXR96916 CRTNsyn
MNXR96916 CRTNsyn_cho
MNXR96926 CSNATirm
MNXR96926 CSNATm
MNXR96955 CXHY
MNXR96955 ZHY
MNXR96988 CYSAMO
MNXR96988 CYSAMO_cho
MNXR96990 CYSATm
MNXR96990 CYSTAm
MNXR96994 CYSDS
MNXR96994 TRPAS1
MNXR97007 CYSS
MNXR97007 CYSS_2
MNXR97151 DASCBH
MNXR97151 DASCBH_cho
MNXR97317 DGGH
MNXR97317 GALS3
MNXR97327 DGTPtm
MNXR97327 DNDPt62m
MNXR97369 DHBS
MNXR97369 DHBSr
MNXR97370 DHBSH
MNXR97370 DHBSH_1
MNXR97379 DHCINDO
MNXR97379 DHCINDO_1
MNXR97399 DHFOR
MNXR97399 r0512
MNXR97400 FOLR2
MNXR97400 FOLR2_1
MNXR97403 DHFS
MNXR97403 DHFS_1
MNXR97414 DHNPA
MNXR97414 DHNPA_1
MNXR97432 DHPD
MNXR97432 DHPM1
MNXR97439 DHPS2
MNXR97439 DHPS2_1
MNXR97439 FOLD3
MNXR97439 FOLD3_1
MNXR97450 DHRT_2mbcoa
MNXR97450 r0604
MNXR97452 DHRT_ivcoa
MNXR97452 r0656
MNXR97628 DNDPt10m
MNXR97628 DNDPt29m
MNXR97629 DNDPt11m
MNXR97629 DNDPt35m
MNXR97632 DNDPt14m
MNXR97632 DNDPt22m
MNXR97633 DNDPt15m
MNXR97633 DNDPt33m
MNXR97634 DNDPt16m
MNXR97634 DNDPt8m
MNXR97635 DNDPt17m
MNXR97635 DNDPt26m
MNXR97641 DNDPt23m
MNXR97641 DNDPt34m
MNXR97642 DNDPt24m
MNXR97642 DNDPt9m
MNXR97643 DNDPt25m
MNXR97643 DNDPt27m
MNXR97644 DNDPt28m
MNXR97644 DNDPt36m
MNXR97680 DNTPPA
MNXR97680 DNTPPA_1
MNXR97767 DPHAPC100
MNXR97767 PHAPC100
MNXR97768 DPHAPC120
MNXR97768 PHAPC120
MNXR97769 DPHAPC121
MNXR97769 PHAPC121
MNXR97770 DPHAPC140
MNXR97770 PHAPC140
MNXR97771 DPHAPC141
MNXR97771 PHAPC141
MNXR97772 DPHAPC60
MNXR97772 PHAPC60
MNXR97773 DPHAPC80
MNXR97773 PHAPC80
MNXR97796 DSCLCOCH
MNXR97796 SHCHCC
MNXR97836 D_LACDHm
MNXR97836 D_LACDm
MNXR97883 ECOAH1
MNXR97883 ECOAH1_1
MNXR97890 ECOAH5
MNXR97890 ECOAH5_1
MNXR97933 ENTCS
MNXR97933 ENTCS_1
MNXR98064 EX_4hoxpac_e
MNXR98064 EX_4hphac_e
MNXR98131 EX_CE5868_e
MNXR98131 EX_asp__L_e
MNXR98137 EX_HC00822_e
MNXR98137 EX_chitob_e
MNXR98152 EX_HC02172_e
MNXR98152 EX_zn2_e
MNXR98195 EX_abt__L_e
MNXR98195 EX_abt_e
MNXR98307 EX_buts_e
MNXR98307 EX_butso3_e
MNXR98460 EX_eths_e
MNXR98460 EX_ethso3_e
MNXR98539 EX_galct__D_e
MNXR98539 EX_galctr__D_e
MNXR98574 EX_glcn__D_e
MNXR98574 EX_glcn_e
MNXR98639 EX_h2co3_e
MNXR98639 EX_hco3_e
MNXR98641 EX_h2o_e
MNXR98641 EX_oh1_e
MNXR98684 EX_isetac_e
MNXR98684 EX_istnt_e
MNXR98715 EX_lipoate_e
MNXR98715 EX_lipt_e
MNXR98755 EX_metox_e
MNXR98755 EX_metsox_R__L_e
MNXR98755 EX_metsox_S__L_e
MNXR98813 EX_orn__L_e
MNXR98813 EX_orn_e
MNXR98950 EX_sula_e
MNXR98950 EX_sulfac_e
MNXR99043 EX_xolest2_cho_e
MNXR99043 EX_xolest_cho_e
MNXR99275 FAOXC121_3Em
MNXR99275 FAOXC121_3Zm
MNXR99284 FAOXC141_5Em
MNXR99284 FAOXC141_5Zm
MNXR99284 FAOXC141_7Em
MNXR99303 FAOXC161_7Em
MNXR99303 FAOXC161_7Zm
MNXR99303 FAOXC161_9Em
MNXR99466 FBP26
MNXR99466 FBPPH
MNXR99468 FCI
MNXR99468 r0598
MNXR99471 FCLT
MNXR99471 FCLT_2
MNXR99473 FCOAH
MNXR99473 FCOAH2
MNXR99505 FE2utm
MNXR99505 FEtm
MNXR99593 FGLU
MNXR99593 FGLU_1
MNXR99601 FLVR
MNXR99601 FLVR_1
MNXR99604 FMETTRS
MNXR99604 FMETTRS_1
MNXR99614 FOMETRi
MNXR99614 THFAT
MNXR99614 MTAM
MNXR99614 THFATm
MNXR99615 FORA
MNXR99615 FORAMD
MNXR99636 FRD
MNXR99636 SUCD1
MNXR99636 FRDm
MNXR99636 SUCD1m
MNXR99641 FRD7
MNXR99641 SUCDi
MNXR99667 FTAL
MNXR99667 FTCD
MNXR99668 FTCL
MNXR99668 FTHFCLm
MNXR99671 FTHFI
MNXR99671 FTHFI_1
MNXR99905 G6PDA
MNXR99905 G6PDA_1
MNXR99910 G6PI2
MNXR99910 G6PI3
MNXR99955 GALCTD
MNXR99955 GALCTD_1
MNXR99962 GALCTRt2
MNXR99962 GALCTt2r
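For manual review it can help to collapse the two-column table into one group per MetaNetX ID. The following is a minimal sketch, assuming the table is available as whitespace-separated text; the function name `group_by_metanetx` is hypothetical, and the sample rows are copied from the table above.

```python
from collections import defaultdict


def group_by_metanetx(table_text):
    """Collect BiGG IDs per MetaNetX ID from 'MNXR... bigg_id' lines."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[0].startswith('MNXR'):
            continue  # skip the header and blank lines
        groups[fields[0]].append(fields[1])
    return dict(groups)


sample = """metanetx_id bigg_id
MNXR102137 OCBT
MNXR102137 OCBT_1
MNXR102137 OCT
MNXR102137 OCBTm
MNXR102137 OCTm
MNXR104650 SULR
MNXR104650 SULR_1
"""

groups = group_by_metanetx(sample)
largest = max(groups, key=lambda mnx_id: len(groups[mnx_id]))
print(largest, groups[largest])
```

The table does not carry a compartment column, so a group with more than two members may combine several compartment-specific duplicate sets; judging by their names, OCBTm and OCTm look like such a case.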

zakandrewking commented 6 years ago

Thanks for collecting these here! We will definitely go through them for the next release.

djinnome commented 6 years ago

@ChristianLieven, @Midnighter, the code that generated this list would also make a really nice unit test for memote

Midnighter commented 6 years ago

Agreed, we just talked about this today!

willigott commented 6 years ago

@zakandrewking: Thanks for looking into this. I guess the same should be done for the metabolites (see #274).

If you decide to merge entries, how will this be documented? 1) Will the old IDs be stored in "old_bigg_ids"? 2) If you then update the models accordingly, would it be possible to start versioning the models so that changes can easily be tracked?

zakandrewking commented 6 years ago

When we merge things, we do store previous IDs in old_bigg_ids.
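A consumer of the data could use that field to map a retired ID onto its current one. The sketch below rests on two assumptions that should be checked against the actual download: that the tab-separated reactions file has columns named `bigg_id` and `old_bigg_ids`, and that multiple old IDs are separated by semicolons. The function name and the toy rows are hypothetical.

```python
import csv
import io


def build_old_id_lookup(tsv_text, sep=';'):
    """Map each former ID to the set of current IDs that list it."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter='\t'):
        current = row['bigg_id']
        for old_id in (row.get('old_bigg_ids') or '').split(sep):
            old_id = old_id.strip()
            # an entry may list its own current ID; that is not a rename
            if old_id and old_id != current:
                lookup.setdefault(old_id, set()).add(current)
    return lookup


# hypothetical toy data, not real BiGG entries
toy_tsv = (
    'bigg_id\told_bigg_ids\n'
    'RXN_A\tRXN_A; RXN_A_1\n'
    'RXN_B\t\n'
)
lookup = build_old_id_lookup(toy_tsv)
print(lookup)
# → {'RXN_A_1': {'RXN_A'}}
```

A set is used as the value because one old ID could, in principle, be listed by more than one current entry; such cases would need manual resolution.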

We are now building a web application from the ground up for model reconstruction, so it will solve many of the open challenges with bigg (collaboration, versioning, identifiers, etc.)

zakandrewking commented 6 years ago

BTW, @willigott, thanks for all of the issues you opened. We do plan to make all of these fixes, either in BiGG Models itself, or in its successor.

ChristianLieven commented 6 years ago

@willigott would you mind sharing the code you used to generate this list with us over at https://github.com/opencobra/memote? We'd like to implement this as a test in our tool. Of course you're also more than welcome to implement it yourself and open a PR!

https://github.com/opencobra/memote/issues/240

willigott commented 6 years ago

@ChristianLieven: Sure, no problem. As this was just a very quick check, the code is not generic enough, i.e. not of sufficient quality for a PR, and I would not know where and how exactly it should be incorporated, so I will just leave it here. Feel free to use or adapt whatever you consider useful (I never gave it a try, but maybe one could use DD-DeCaF's id-mapper to do this). I guess something similar should be done for metabolites, too (see #274). Please also keep in mind that this only works for annotated entries. For example, it would not work for http://bigg.ucsd.edu/universal/metabolites/C02712 and http://bigg.ucsd.edu/universal/metabolites/acmet, as the latter does not have any annotation. I am not sure what the best way is to filter those (the same could apply to reactions).

"""
This file determines potential duplicate reactions in BiGG. Reactions are considered duplicates if:
i) they have the same MetaNetX ID
ii) they show the same compartment information (which is retrieved from the metabolite IDs)
"""

import re

import pandas as pd

def retrieve_compartment_info(rea_string):

    """
    given a reaction string, it will return compartment information based on the metabolite IDs' suffix

    :param rea_string: reaction string
    :return: compartment string

    Example:
    retrieve_compartment_info('atp_c + coa_c + ppa_c <-> adp_c + pi_c + ppcoa_c') returns 'c'
    retrieve_compartment_info('acgam_e <-> acgam_p') returns 'ep'
    """

    # extract all compounds from the reaction string
    all_compounds = re.findall(r'\w+', rea_string)

    # get sorted set of compartments; only the last character of each ID is used,
    # so multi-character compartment suffixes are not distinguished
    all_compartments = sorted({mi[-1] for mi in all_compounds if '_' in mi})

    return ''.join(all_compartments)

# reaction information from bigg; downloaded from http://bigg.ucsd.edu/data_access
rea_bigg = pd.read_csv('bigg_models_reactions.txt', sep='\t')

# get the metanetx ID for a given bigg reaction
metnetx_pat = r'MetaNetX \(MNX\) Equation: http://identifiers\.org/metanetx\.reaction/(MNXR\d+)'
rea_bigg['metanetx_id'] = rea_bigg.loc[:, 'database_links'].str.extract(metnetx_pat, expand=False)

# for now, we are only interested in reactions with a metanetx ID
rea_bigg = rea_bigg.dropna(subset=['metanetx_id'])

# get the compartment info for a given bigg reaction
rea_bigg['comp_info_bigg'] = rea_bigg['reaction_string'].apply(retrieve_compartment_info)

# for now we only check whether metanetx id and compartment are identical
rea_bigg = rea_bigg[['metanetx_id', 'comp_info_bigg', 'bigg_id']]

# group by metanetx id and compartment and identify groups with more than one member
dupls = rea_bigg.groupby(['metanetx_id', 'comp_info_bigg'])['metanetx_id'].transform('size') > 1

# select all potential duplicate reactions
rea_bigg_identical = rea_bigg[dupls].sort_values(['metanetx_id', 'comp_info_bigg', 'bigg_id']).reset_index(drop=True)

rea_bigg_identical.drop('comp_info_bigg', axis=1).to_csv('potential_identical_reactions_bigg.csv', index=False)
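The script above drops every reaction without a MetaNetX ID, so those entries are never compared. To see how many entries fall outside the check, the annotated and unannotated IDs can be separated first. This is a minimal sketch using only the standard library; the function name, the shortened pattern, and the toy rows (including the MetaNetX ID `MNXR00001`) are hypothetical.

```python
import re

# shortened version of the pattern used above; matches the ID in the identifiers.org URL
MNX_REACTION = re.compile(r'metanetx\.reaction/(MNXR\d+)')


def split_by_annotation(rows):
    """Split (bigg_id, database_links) rows into annotated and unannotated entries."""
    annotated, unannotated = {}, []
    for bigg_id, database_links in rows:
        match = MNX_REACTION.search(database_links or '')
        if match:
            annotated[bigg_id] = match.group(1)
        else:
            unannotated.append(bigg_id)
    return annotated, unannotated


# hypothetical toy rows, not real BiGG entries
toy_rows = [
    ('RXN_A', 'MetaNetX (MNX) Equation: http://identifiers.org/metanetx.reaction/MNXR00001'),
    ('RXN_B', ''),
    ('RXN_C', None),
]
annotated, unannotated = split_by_annotation(toy_rows)
print(len(annotated), 'annotated;', len(unannotated), 'not checked:', unannotated)
```

The unannotated list is the set that would need a different criterion, for example a comparison of the reaction strings themselves.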

ChristianLieven commented 6 years ago

Thanks @willigott! You can follow the progress here: https://github.com/opencobra/memote/issues/240