SysBioChalmers / Sco-GEM

The consensus GEM for Streptomyces coelicolor -
https://sysbiochalmers.github.io/Sco-GEM/
Creative Commons Attribution 4.0 International

fix: complex grRules #51

Closed edkerk closed 5 years ago

edkerk commented 5 years ago

Description of the issue:

There are various reactions with complex GPR relationships (nested 'and' and 'or' relationships), which increase the risk of mistakes. For instance, GECKO uses GPR rules to separate isoenzymes, so correct GPR rules are crucial.

Expected feature/value/output:

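If a relationship is currently written in a nested form, for instance (taking the IBTMr rule from the list further down):

SCO5415 and (SCO4800 or SCO6833)

then ideally all possible combinations should be written out explicitly:

(SCO5415 and SCO4800) or (SCO5415 and SCO6833)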
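
The expansion described above (every nested rule rewritten as a flat list of 'and' groups joined by 'or') can be sketched in a few lines. This is only an illustration in Python, not RAVEN or GECKO code; the names `expand_gr_rule` and `format_dnf` are hypothetical, and the sketch assumes rules use only gene IDs, `and`, `or` and parentheses.

```python
import re
from itertools import product


def expand_gr_rule(rule):
    """Expand a nested rule into a list of gene tuples.

    Each tuple is one 'and' group (one enzyme complex); the tuples
    together are the 'or' alternatives (isoenzymes).
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", rule)
    pos = 0

    def parse_or():
        nonlocal pos
        terms = parse_and()
        while pos < len(tokens) and tokens[pos].lower() == "or":
            pos += 1
            terms = terms + parse_and()  # 'or' concatenates alternatives
        return terms

    def parse_and():
        nonlocal pos
        terms = parse_atom()
        while pos < len(tokens) and tokens[pos].lower() == "and":
            pos += 1
            right = parse_atom()
            # 'and' combines every left alternative with every right one
            terms = [l + r for l, r in product(terms, right)]
        return terms

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of rule")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            terms = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return terms
        if tok == ")" or tok.lower() in ("and", "or"):
            raise ValueError(f"unexpected token {tok!r}")
        return [(tok,)]

    terms = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return terms


def format_dnf(terms):
    """Write the expanded terms back as a flat rule string."""
    parts = []
    for term in terms:
        genes = list(dict.fromkeys(term))  # drop repeated genes, keep order
        parts.append(genes[0] if len(genes) == 1
                     else "(" + " and ".join(genes) + ")")
    return " or ".join(parts)


print(format_dnf(expand_gr_rule("SCO5415 and (SCO4800 or SCO6833)")))
# (SCO5415 and SCO4800) or (SCO5415 and SCO6833)
```

The sketch does not merge duplicate or redundant 'and' groups, so the output of a heavily nested rule may still need manual review.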

Previously, the following RAVEN code was used to fix some of the reactions in Sco4:

%% fix-rxns: correct grRules
% Revert to iMK1208 provided grRules, while standardizing 'and'
% relationships.
model = changeGeneAssoc(model, 'OXDHCOAT', '(SCO5144 and SCO6701) or (SCO5144 and SCO6967) or (SCO5144 and SCO3079) or (SCO5144 and SCO6731)');
model = changeGeneAssoc(model, 'IBMi', '(SCO5415 and SCO4800) or (SCO5415 and SCO6833)');
model = changeGeneAssoc(model, 'AKGDH', '(SCO5281 and SCO2181 and SCO0884) or (SCO5281 and SCO2181 and SCO2180) or (SCO5281 and SCO2181 and SCO4919) or (SCO5281 and SCO7123 and SCO0884) or (SCO5281 and SCO7123 and SCO2180) or (SCO5281 and SCO7123 and SCO4919)');
model = changeGeneAssoc(model, 'AKGDH2', '(SCO4594 and SCO4595 and SCO0681) or (SCO6269 and SCO6270 and SCO0681)');
model = changeGeneAssoc(model, 'PAPSR', '(SCO6100 and SCO0885) or (SCO6100 and SCO3889) or (SCO6100 and SCO5419) or (SCO6100 and SCO5438)');
model = changeGeneAssoc(model, 'GLYCL', '(SCO5471 and SCO1378 and SCO5472 and SCO0884) or (SCO5471 and SCO1378 and SCO5472 and SCO2180) or (SCO5471 and SCO1378 and SCO5472 and SCO4919)');
model = changeGeneAssoc(model, '2MBCOATA', '(SCO1271 and SCO2389) or (SCO1271 and SCO0549) or (SCO1271 and SCO1267) or (SCO1271 and SCO1272) or (SCO2388 and SCO2389) or (SCO2388 and SCO0549) or (SCO2388 and SCO1267) or (SCO2388 and SCO1272) or (SCO6564 and SCO2389) or (SCO6564 and SCO0549) or (SCO6564 and SCO1267) or (SCO6564 and SCO1272)');
model = changeGeneAssoc(model, 'ACCOAC', '(SCO2445 and SCO2777) or (SCO2445 and SCO4921) or (SCO2445 and SCO6271) or (SCO5535 and SCO5536 and SCO2777) or (SCO5535 and SCO5536 and SCO4921) or (SCO5535 and SCO5536 and SCO6271)');
model = changeGeneAssoc(model, {'ACOATA', 'BCOATA', 'IBCOATA', 'IVCOATA', 'PCOATA'}, '(SCO1271 and SCO2389) or (SCO1271 and SCO0549) or (SCO1271 and SCO1267) or (SCO1271 and SCO1272) or (SCO2388 and SCO2389) or (SCO2388 and SCO0549) or (SCO2388 and SCO1267) or (SCO2388 and SCO1272) or (SCO6564 and SCO2389) or (SCO6564 and SCO0549) or (SCO6564 and SCO1267) or (SCO6564 and SCO1272)');
model = changeGeneAssoc(model, 'MCOATA', '(SCO2387 and SCO2389) or (SCO2387 and SCO0549) or (SCO2387 and SCO1267) or (SCO2387 and SCO1272)');
model = changeGeneAssoc(model, 'METSOXR1', '(SCO4956 and SCO0885) or (SCO4956 and SCO3889) or (SCO4956 and SCO5419) or (SCO4956 and SCO5438)');
model = changeGeneAssoc(model, 'METSOXR2', '(SCO6061 and SCO0885) or (SCO6061 and SCO3889) or (SCO6061 and SCO5419) or (SCO6061 and SCO5438)');
model = changeGeneAssoc(model, {'MPTG','MPTG2'}, '(SCO3847 and SCO2709) or (SCO3847 and SCO3894) or (SCO5301 and SCO2709) or (SCO5301 and SCO3894)');
model = changeGeneAssoc(model, {'RNDR1', 'RNDR2', 'RNDR3', 'RNDR4'},'(SCO5225 and SCO5226 and SCO0885) or (SCO5225 and SCO5226 and SCO3889) or (SCO5225 and SCO5226 and SCO5419) or (SCO5225 and SCO5226 and SCO5438) or (SCO5805 and SCO0885) or (SCO5805 and SCO3889) or (SCO5805 and SCO5419) or (SCO5805 and SCO5438)');
model = changeGeneAssoc(model, 'CYO2a', '(SCO2150 and SCO2149 and SCO7236) or (SCO2150 and SCO2149 and SCO2148) or (SCO2150 and SCO2149 and SCO7120)');
model = changeGeneAssoc(model, 'CYO2b', '(SCO1934 and SCO2156 and SCO2151 and SCO1930 and SCO7234) or (SCO1934 and SCO2156 and SCO2151 and SCO1930 and SCO2155)');
model = changeGeneAssoc(model, 'SUCD3', '(SCO4856 and SCO4855 and SCO4858 and SCO4857) or (SCO4856 and SCO5106 and SCO4858 and SCO4857) or (SCO5107 and SCO4855 and SCO4858 and SCO4857) or (SCO5107 and SCO5106 and SCO4858 and SCO4857) or (SCO7109 and SCO4855 and SCO4858 and SCO4857) or (SCO7109 and SCO5106 and SCO4858 and SCO4857)');
model = changeGeneAssoc(model, 'TRDR', '(SCO3890 and SCO0885) or (SCO3890 and SCO3889) or (SCO3890 and SCO5419) or (SCO3890 and SCO5438) or (SCO6834 and SCO0885) or (SCO6834 and SCO3889) or (SCO6834 and SCO5419) or (SCO6834 and SCO5438) or (SCO7298 and SCO0885) or (SCO7298 and SCO3889) or (SCO7298 and SCO5419) or (SCO7298 and SCO5438)');
model = changeGeneAssoc(model, 'PPCOAC', '(SCO2776 and SCO2777) or (SCO4380 and SCO4381) or (SCO4921 and SCO4925) or (SCO4921 and SCO4926) or (SCO6271 and SCO4925) or (SCO6271 and SCO4926)');
model = changeGeneAssoc(model, '2OXOADOX', '(SCO5281 and SCO2181 and SCO0884) or (SCO5281 and SCO2181 and SCO2180) or (SCO5281 and SCO2181 and SCO4919) or (SCO5281 and SCO7123 and SCO0884) or (SCO5281 and SCO7123 and SCO2180) or (SCO5281 and SCO7123 and SCO4919)');
model = changeGeneAssoc(model, 'THIORDXi', '(SCO2901 and SCO0885) or (SCO7353 and SCO0885) or (SCO2901 and SCO3889) or (SCO7353 and SCO3889) or (SCO2901 and SCO5419) or (SCO7353 and SCO5419) or (SCO2901 and SCO5438) or (SCO7353 and SCO5438) or SCO4444');
model = changeGeneAssoc(model, {'OIVD1', 'OIVD2', 'OIVD3'}, '(SCO3816 and SCO3817 and SCO3815 and SCO0884) or (SCO3816 and SCO3817 and SCO3815 and SCO2180) or (SCO3816 and SCO3817 and SCO3815 and SCO4919) or (SCO3816 and SCO3817 and SCO3829 and SCO0884) or (SCO3816 and SCO3817 and SCO3829 and SCO2180) or (SCO3816 and SCO3817 and SCO3829 and SCO4919) or (SCO3830 and SCO3831 and SCO3815 and SCO0884) or (SCO3830 and SCO3831 and SCO3815 and SCO2180) or (SCO3830 and SCO3831 and SCO3815 and SCO4919) or (SCO3830 and SCO3831 and SCO3829 and SCO0884) or (SCO3830 and SCO3831 and SCO3829 and SCO2180) or (SCO3830 and SCO3831 and SCO3829 and SCO4919)');

% Redefine, subunits were missing
model = changeGeneAssoc(model, 'PDH', '(SCO1269 and SCO1270 and SCO2183 and SCO0884 and SCO1268) or (SCO1269 and SCO1270 and SCO2183 and SCO0884 and SCO7123) or (SCO1269 and SCO1270 and SCO2183 and SCO0884 and SCO2181) or (SCO1269 and SCO1270 and SCO2183 and SCO2180 and SCO1268) or (SCO1269 and SCO1270 and SCO2183 and SCO2180 and SCO7123) or (SCO1269 and SCO1270 and SCO2183 and SCO2180 and SCO2181) or (SCO1269 and SCO1270 and SCO2183 and SCO4919 and SCO1268) or (SCO1269 and SCO1270 and SCO2183 and SCO4919 and SCO7123) or (SCO1269 and SCO1270 and SCO2183 and SCO4919 and SCO2181) or (SCO1269 and SCO1270 and SCO2371 and SCO0884 and SCO1268) or (SCO1269 and SCO1270 and SCO2371 and SCO0884 and SCO7123) or (SCO1269 and SCO1270 and SCO2371 and SCO0884 and SCO2181) or (SCO1269 and SCO1270 and SCO2371 and SCO2180 and SCO1268) or (SCO1269 and SCO1270 and SCO2371 and SCO2180 and SCO7123) or (SCO1269 and SCO1270 and SCO2371 and SCO2180 and SCO2181) or (SCO1269 and SCO1270 and SCO2371 and SCO4919 and SCO1268) or (SCO1269 and SCO1270 and SCO2371 and SCO4919 and SCO7123) or (SCO1269 and SCO1270 and SCO2371 and SCO4919 and SCO2181) or (SCO1269 and SCO1270 and SCO7124 and SCO0884 and SCO1268) or (SCO1269 and SCO1270 and SCO7124 and SCO0884 and SCO7123) or (SCO1269 and SCO1270 and SCO7124 and SCO0884 and SCO2181) or (SCO1269 and SCO1270 and SCO7124 and SCO2180 and SCO1268) or (SCO1269 and SCO1270 and SCO7124 and SCO2180 and SCO7123) or (SCO1269 and SCO1270 and SCO7124 and SCO2180 and SCO2181) or (SCO1269 and SCO1270 and SCO7124 and SCO4919 and SCO1268) or (SCO1269 and SCO1270 and SCO7124 and SCO4919 and SCO7123) or (SCO1269 and SCO1270 and SCO7124 and SCO4919 and SCO2181)');

% Not standardized, due to too many combinations
model = changeGeneAssoc(model, 'NADH17', 'SCO4564 and SCO4566 and SCO4568 and (SCO4562 or SCO4599) and (SCO4563 or SCO4600) and (SCO3392 or SCO4565) and (SCO4567 or SCO6560) and (SCO4569 or SCO4602) and (SCO4570 or SCO4603) and (SCO4571 or SCO4604) and (SCO4572 or SCO4605) and (SCO4573 or SCO4606 or SCO6954) and (SCO4574 or SCO4607) and (SCO4575 or SCO4608)');
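
To put a number on "too many combinations", the size of the fully expanded rule can be counted without expanding it: an 'or' adds the term counts of its operands and an 'and' multiplies them. The Python sketch below is an illustration only (the function name `count_expanded_terms` is hypothetical, and it is not part of RAVEN); it maps each gene to `1`, `or` to `+` and `and` to `*`, then evaluates the resulting arithmetic. The count is an upper bound, since duplicate or redundant groups are not merged.

```python
import re


def count_expanded_terms(rule):
    """Number of 'and' groups a rule would have after full expansion."""
    def to_arith(match):
        word = match.group(0).lower()
        return {"and": "*", "or": "+"}.get(word, "1")  # any other word is a gene

    arith = re.sub(r"\w+", to_arith, rule)
    # Only digits '1', operators, parentheses and whitespace may remain.
    if not re.fullmatch(r"[1*+()\s]+", arith):
        raise ValueError(f"unexpected characters in rule: {rule!r}")
    return eval(arith, {"__builtins__": {}})


nadh17 = (
    "SCO4564 and SCO4566 and SCO4568 and (SCO4562 or SCO4599) and "
    "(SCO4563 or SCO4600) and (SCO3392 or SCO4565) and (SCO4567 or SCO6560) and "
    "(SCO4569 or SCO4602) and (SCO4570 or SCO4603) and (SCO4571 or SCO4604) and "
    "(SCO4572 or SCO4605) and (SCO4573 or SCO4606 or SCO6954) and "
    "(SCO4574 or SCO4607) and (SCO4575 or SCO4608)"
)
print(count_expanded_terms(nadh17))  # 3072
```

Ten two-way choices and one three-way choice give 2^10 * 3 = 3072 'and' groups for NADH17, compared with the 27 written out for PDH above.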

Reproducing these results:

In RAVEN, problematic grRules (i.e. GPR rules) can be identified using standardizeGrRules:

>> standardizeGrRules(model);
Warning: Potentially problematic ") AND (", ") AND" or "AND (" relationships found in

  - grRule #2MBCOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #2OXOADOX: (SCO5281 and (SCO2181 or SCO7123 or SCO1268) and (SCO0884 or SCO2180 or SCO4919))
  - grRule #ACCOAC: ((SCO2445 or (SCO5535 and SCO5536)) and (SCO2777 or SCO4921 or SCO6271))
  - grRule #ACCOAC_1: ((SCO6271 or SCO4921) and SCO2445 and SCO2777)
  - grRule #ACHBS: ((SCO5512 or SCO2769 or SCO6584) and SCO5513)
  - grRule #ACLS: ((SCO5512 or SCO2769 or SCO6584) and SCO5513)
  - grRule #ACOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #AKGDH: (SCO5281 and (SCO2181 or SCO7123 or SCO1268) and (SCO0884 or SCO2180 or SCO4919))
  - grRule #AKGDH2: (((SCO4594 and SCO4595) or (SCO6269 and SCO6270)) and SCO0681)
  - grRule #ATNS_nh4: (SCO3213 and (SCO3214 or SCO2043))
  - grRule #BCOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #CDAS2: ((SCO3246 or SCO1271 or SCO2388 or SCO6564 or SCO6826 or SCP1233B) and SCO3249)
  - grRule #CU1O: ((SCO3439 and SCO3440) and SCO6712)
  - grRule #CYO1ab: (SCO2150 and SCO2149 and (SCO7236 or SCO2148 or SCO7120))
  - grRule #CYO2b: (SCO1934 and SCO2156 and (SCO7234 or SCO2155) and SCO2151 and SCO1930 and SCO2154)
  - grRule #GLUSx: ((SCO1977 or SCO2025) and SCO2026)
  - grRule #GLUSy: ((SCO1977 or SCO2025) and SCO2026)
  - grRule #GLYCL: (SCO5471 and SCO1378 and SCO5472 and (SCO0884 or SCO2180 or SCO4919))
  - grRule #IBCOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #IBTMr: (SCO5415 and (SCO4800 or SCO6833))
  - grRule #IVCOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #KAS15: ((SCO1271 or SCO2388 or SCO6564 or SCO3246 or SCO6826 or SCP1233B) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #MCOATA: (SCO2387 and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #METSOXR1: (SCO4956 and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #METSOXR2: (SCO6061 and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #MMM: ((SCO4800 or SCO6833) and (SCO4869 or SCO6832 or SCO4515))
  - grRule #MPTG: ((SCO3847 or SCO5301) and (SCO2709 or SCO3894))
  - grRule #MPTG2: ((SCO3847 or SCO5301) and (SCO2709 or SCO3894))
  - grRule #OIVD1r: (((SCO3816 and SCO3817) or (SCO3830 and SCO3831)) and (SCO3815 or SCO3829) and (SCO0884 or SCO2180 or SCO4919))
  - grRule #OIVD2: (((SCO3816 and SCO3817) or (SCO3830 and SCO3831)) and (SCO3815 or SCO3829) and (SCO0884 or SCO2180 or SCO4919))
  - grRule #OIVD3: (((SCO3816 and SCO3817) or (SCO3830 and SCO3831)) and (SCO3815 or SCO3829) and (SCO0884 or SCO2180 or SCO4919))
  - grRule #OXDHCOAT: (SCO5144 and (SCO6701 or SCO6967 or SCO3079 or SCO6731))
  - grRule #PAPSR: (SCO6100 and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #PCOATA: ((SCO1271 or SCO2388 or SCO6564) and (SCO2389 or SCO0549 or SCO1267 or SCO1272))
  - grRule #PDH: (SCO1269 and SCO1270 and (SCO2183 or SCO2371 or SCO7124) and (SCO0884 or SCO04919 or SCO2180) and (SCO1268 or SCO7123 or SCO2181) and (SCO3815 or SCO3829))
  - grRule #PPCOAC: ((SCO2776 and SCO2777) or (SCO4380 and SCO4381) or ((SCO4921 or SCO6271) and (SCO4925 and SCO4926)))
  - grRule #RNDR1: (((SCO5225 and SCO5226) or SCO5805) and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #RNDR2: (((SCO5225 and SCO5226) or SCO5805) and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #RNDR3: (((SCO5225 and SCO5226) or SCO5805) and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #RNDR4: (((SCO5225 and SCO5226) or SCO5805) and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #SUCD9: ((SCO0922 or SCO4855 or SCO5106) and (SCO0923 or SCO4856 or SCO5107 or SCO7109))
  - grRule #THIORDXi: (((SCO2901 or SCO7353) and (SCO0885 or SCO3889 or SCO5419 or SCO5438)) or SCO4444)
  - grRule #TRDR: ((SCO3890 or SCO6834 or SCO7298) and (SCO0885 or SCO3889 or SCO5419 or SCO5438))
  - grRule #CYO2a: (SCO2150 and SCO2149 and (SCO7236 or SCO2148 or SCO7120))
  - grRule #NADH17b: ((SCO4562 or SCO4599) and (SCO4563 or SCO4600) and SCO4564 and (SCO3392 or SCO4565) and SCO4566 and (SCO4567 or SCO6560) and SCO4568 and (SCO4569 or SCO4602) and (SCO4570 or SCO4603) and (SCO4571 or SCO4604) and (SCO4572 or SCO4605) and (SCO4573 or SCO4606 or SCO6954) and (SCO4574 or SCO4607) and (SCO4575 or SCO4608 or SCO6956))
  - grRule #NADH8: ((SCO4562 or SCO4599) and (SCO4563 or SCO4600) and SCO4564 and (SCO3392 or SCO4565) and SCO4566 and (SCO4567 or SCO6560) and SCO4568 and (SCO4569 or SCO4602) and (SCO4570 or SCO4603) and (SCO4571 or SCO4604) and (SCO4572 or SCO4605) and (SCO4573 or SCO4606 or SCO6954) and (SCO4574 or SCO4607) and (SCO4575 or SCO4608 or SCO6956))
  - grRule #SUCD3: ((SCO4856 or SCO5107 or SCO7109) and (SCO4855 or SCO5106) and SCO4858 and SCO4857)
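
The warning text names the pattern being flagged: an 'and' sitting directly next to a parenthesis. A rough Python imitation of that check is sketched below. It is an assumption based only on the wording of the warning, not RAVEN's actual implementation, and `find_nested_and_rules` is a hypothetical name.

```python
import re

# ') and' or 'and (' : an 'and' directly adjacent to a parenthesis
NESTED_AND = re.compile(r"\)\s*and\b|\band\s*\(", re.IGNORECASE)


def find_nested_and_rules(gr_rules):
    """Return the reaction IDs whose rule contains the flagged pattern."""
    return [rxn for rxn, rule in gr_rules.items() if NESTED_AND.search(rule)]


rules = {
    "IBTMr": "(SCO5415 and (SCO4800 or SCO6833))",
    "CU1O": "((SCO3439 and SCO3440) and SCO6712)",
    "IBTMr_expanded": "(SCO5415 and SCO4800) or (SCO5415 and SCO6833)",
}
print(find_nested_and_rules(rules))  # ['IBTMr', 'CU1O']
```

A purely textual check like this also flags rules such as CU1O, where the parentheses are redundant rather than hiding an 'or', so a hit means "inspect this rule", not "this rule is wrong".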

sulheim commented 5 years ago

OK, but the drawback of this formalism is that the current form is much more readable. Would it be an option to improve the GECKO parser instead?

edkerk commented 5 years ago

This is model-specific, and certain GPRs should be manually inspected to ensure that they make sense. Regardless, I'll include the swapping of GPR rules in the ecScoGEM script.
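
As an aid to that manual inspection, one can check whether a rewritten rule is logically equivalent to the nested original by comparing them over every presence/absence combination of the genes involved. The Python sketch below is an illustration under stated assumptions, not code from RAVEN or GECKO: `rules_equivalent` is a hypothetical name, gene IDs must be valid Python identifiers (true for the SCO... IDs here), and `eval` is applied only after the rule has been restricted to word characters, whitespace and parentheses.

```python
import re
from itertools import product


def _prepare(rule):
    # Lower-case the operators so Python's own 'and'/'or' evaluate the rule.
    rule = re.sub(r"\b(and|or)\b", lambda m: m.group(0).lower(),
                  rule, flags=re.IGNORECASE)
    if not re.fullmatch(r"[\w\s()]+", rule):
        raise ValueError(f"unexpected characters in rule: {rule!r}")
    genes = set(re.findall(r"[A-Za-z_]\w*", rule)) - {"and", "or"}
    return compile(rule.strip(), "<grRule>", "eval"), genes


def rules_equivalent(rule_a, rule_b):
    """True if both rules agree for every combination of gene presence."""
    code_a, genes_a = _prepare(rule_a)
    code_b, genes_b = _prepare(rule_b)
    genes = sorted(genes_a | genes_b)
    for values in product((False, True), repeat=len(genes)):
        present = dict(zip(genes, values))
        if eval(code_a, {"__builtins__": {}}, present) != \
           eval(code_b, {"__builtins__": {}}, present):
            return False
    return True


nested = "(SCO5144 and (SCO6701 or SCO6967 or SCO3079 or SCO6731))"
expanded = ("(SCO5144 and SCO6701) or (SCO5144 and SCO6967) or "
            "(SCO5144 and SCO3079) or (SCO5144 and SCO6731)")
print(rules_equivalent(nested, expanded))  # True

# Hypothetical mistake: SCO6967 written without its partner SCO5144
broken = "(SCO5144 and SCO6701) or SCO6967"
print(rules_equivalent(nested, broken))  # False
```

The number of combinations doubles with every gene, so this brute-force check suits rules like OXDHCOAT (5 genes) but not the NADH dehydrogenase rules with more than 25 genes.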

edkerk commented 5 years ago

This is included in the preprocessModel script in the GECKO submodule.