SysBioChalmers / Human-GEM

The generic genome-scale metabolic model of Homo sapiens
https://sysbiochalmers.github.io/Human-GEM-guide/
Creative Commons Attribution 4.0 International

Remove Duplicate V-Type ATPase Reaction #829

Open Devlin-Moyer opened 3 months ago

Devlin-Moyer commented 3 months ago

Main improvements in this PR:

As discussed in #348, MAR07799 and MAR00080 both appear to represent the activity of a V-type ATPase and involve the same metabolites, with slightly different stoichiometries and GPRs. The GPR of MAR00080 appears to capture how the genes encoding the different subunits of the V-type ATPase relate to each other more accurately than the GPR of MAR07799 does, so this pull request removes MAR07799 and merges its annotations in reactions.tsv with those of MAR00080.
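For anyone curious what "merging annotations" can look like in practice, here is a minimal sketch, not the code used in this PR. It assumes each reaction's row in reactions.tsv has been read into a plain dict; the column names (`rxns`, `rxnKEGGID`, `rxnBiGGID`) and all annotation values are placeholders for illustration, and `merge_annotations` is a hypothetical helper.

```python
def merge_annotations(kept, removed, id_column="rxns", sep=";"):
    """Merge two annotation rows (column -> value), preferring `kept`.

    Empty fields in `kept` are filled from `removed`; fields that differ
    are joined with `sep`. The ID column is never merged.
    """
    merged = dict(kept)
    for column, value in removed.items():
        if column == id_column or not value:
            continue
        current = merged.get(column, "")
        if not current:
            merged[column] = value
        elif value not in current.split(sep):
            merged[column] = current + sep + value
    return merged


# Placeholder rows: the annotation values are invented for illustration.
kept_row = {"rxns": "MAR00080", "rxnKEGGID": "", "rxnBiGGID": "EXAMPLE_A"}
removed_row = {"rxns": "MAR07799", "rxnKEGGID": "R_EXAMPLE", "rxnBiGGID": "EXAMPLE_B"}

merged_row = merge_annotations(kept_row, removed_row)
print(merged_row)
```

Whether conflicting values should be joined or resolved by hand is a curation decision; joining them just keeps both visible for review.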

Since MAR07799 was the only reaction that ENSG00000071553 (ATP6AP1) was associated with, and this paper claims "missense disease mutations in ATP6AP1 cause reduced V-ATPase function by affecting its folding and assembly", and shows it bound to the V0 subunits, which are grouped in the first part of the GPR for MAR00080, this PR also changes the GPR of MAR00080 to: ( (ATP6AP1 and ATP6V0A4 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) or (ATP6AP1 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0A2 and ATP6V0C) or (ATP6AP1 and ATP6V0A4 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (ATP6AP1 and ATP6V0A1 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) or (ATP6AP1 and ATP6V0A1 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (TCIRG1 and ATP6AP1 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (ATP6AP1 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0A2 and ATP6V0C) or (TCIRG1 and ATP6AP1 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) ) and ( (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1G1 and ATP6V1B2 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G1 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1G1 and ATP6V1C2 and ATP6V1B2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1B2 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1B2 and ATP6V1C1 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1C2 and ATP6V1B2 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1B2 and ATP6V1C1 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G3 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1C2 and ATP6V1B2 and ATP6V1G3 and ATP6V1E2) or (ATP6V1H 
and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C2 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C1 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1B2 and ATP6V1G3 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1B2 and ATP6V1G3 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C1 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G3 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1B2 and ATP6V1G3) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1B2 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1G3) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C2 and ATP6V1G3 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G1 and ATP6V1C2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C2 and ATP6V1B2) ) and adds PMID:33065002 as a reference for MAR00080

I hereby confirm that I have:

Devlin-Moyer commented 3 months ago

Ah, the YAML conversion & validation tests are failing, apparently because MAR07799 was the only reaction that ENSG00000071553 (ATP6AP1) was associated with.
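A quick way to spot this kind of problem before the tests do is to check which genes would no longer appear in any GPR once a reaction is removed. This is a rough sketch under assumptions: GPRs are given as plain strings keyed by reaction ID, the two GPRs below are shortened toy stand-ins rather than the real ones, and `genes_in` and `orphaned_genes` are hypothetical helpers, not part of the repository's test suite.

```python
import re


def genes_in(gpr):
    """Return the set of gene identifiers mentioned in a GPR string."""
    tokens = re.findall(r"[A-Za-z0-9_.\-]+", gpr)
    return {t for t in tokens if t.lower() not in {"and", "or"}}


def orphaned_genes(gprs, removed_rxn):
    """Genes that appear only in the GPR of `removed_rxn`."""
    remaining = set()
    for rxn_id, gpr in gprs.items():
        if rxn_id != removed_rxn:
            remaining |= genes_in(gpr)
    return genes_in(gprs[removed_rxn]) - remaining


# Toy GPRs for illustration only; the real ones are much longer.
toy_gprs = {
    "MAR07799": "ATP6AP1 and ATP6V0C",
    "MAR00080": "(ATP6V0C and ATP6V0B) or (TCIRG1 and ATP6V0B)",
}

orphans = orphaned_genes(toy_gprs, "MAR07799")
print(orphans)
```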

Devlin-Moyer commented 3 months ago

According to this paper, which seems to have been the first to characterize ATP6AP1's role in the assembly of the V-type ATPase complex, "missense disease mutations in ATP6AP1 cause reduced V-ATPase function by affecting its folding and assembly." So it should probably be added to the GPR of MAR00080.

The GPR of MAR00080 currently looks like this: ( (ATP6V0A4 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) or (ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0A2 and ATP6V0C) or (ATP6V0A4 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (ATP6V0A1 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) or (ATP6V0A1 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (TCIRG1 and ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0C) or (ATP6V0E1 and ATP6V0B and ATP6V0D1 and ATP6V0A2 and ATP6V0C) or (TCIRG1 and ATP6V0E1 and ATP6V0B and ATP6V0D2 and ATP6V0C) ) and ( (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1G1 and ATP6V1B2 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G1 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1G1 and ATP6V1C2 and ATP6V1B2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1B2 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1B2 and ATP6V1C1 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1C2 and ATP6V1B2 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1B2 and ATP6V1C1 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G3 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1C2 and ATP6V1B2 and ATP6V1G3 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C2 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C1 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1B2 and ATP6V1G3 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D 
and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1B2 and ATP6V1G3 and ATP6V1C1) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C1 and ATP6V1G2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G3 and ATP6V1C1 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1B2 and ATP6V1G3) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1B2 and ATP6V1G2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1C2 and ATP6V1G3) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1C2 and ATP6V1G3 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1G1 and ATP6V1C2 and ATP6V1E2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1B1 and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C2) or (ATP6V1H and ATP6V1D and ATP6V1A and ATP6V1F and ATP6V1E1 and ATP6V1G1 and ATP6V1C2 and ATP6V1B2) )

So it looks like all the genes encoding the subunits of the V0 complex are grouped together in the first bit and the genes encoding the subunits of the V1 complex are in the second bit. The graphical abstract of the paper I linked earlier shows ATP6AP1 attached to the bottom of the V0 complex, so I'm just gonna add "and ATP6AP1" to each of the V0 groups in the first bit.
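Doing that by hand across eight groups is error-prone, so here is a rough sketch of how it could be scripted. It assumes the GPR has the shape shown above, `( (g1 and g2) or (...) ) and ( ... )`, with the V0 alternatives in the first top-level parenthesised block. `add_gene_to_first_block` is a hypothetical helper and the GPR below is a shortened toy version, not the real one.

```python
import re


def add_gene_to_first_block(gpr, gene):
    """Add `gene` to every innermost 'and' group of the first top-level block."""
    depth = 0
    end = None
    for i, ch in enumerate(gpr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                end = i  # closing parenthesis of the first top-level block
                break
    if end is None:
        raise ValueError("no balanced top-level parenthesised block found")
    first, rest = gpr[: end + 1], gpr[end + 1 :]

    def with_gene(match):
        genes = re.split(r"\s+and\s+", match.group(1).strip())
        if gene not in genes:  # leave groups that already contain the gene
            genes.append(gene)
        return "(" + " and ".join(genes) + ")"

    # Innermost groups are the parentheses that contain no other parentheses.
    return re.sub(r"\(([^()]+)\)", with_gene, first) + rest


# Shortened toy GPR for illustration only.
toy_gpr = "( (ATP6V0A4 and ATP6V0C) or (TCIRG1 and ATP6V0C) ) and ( (ATP6V1A and ATP6V1H) )"
updated = add_gene_to_first_block(toy_gpr, "ATP6AP1")
print(updated)
```

The gene is appended at the end of each group here, which is logically equivalent to any other position within an `and` group. Running the function twice leaves the result unchanged.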

github-actions[bot] commented 1 day ago

This PR has been automatically tested with GH Actions. Here is the output of the macaw test:

Starting dead-end test...
- Found 1373 dead-end metabolites.
- Found 1129 reactions incapable of sustaining steady-state fluxes in either direction due to these dead-ends.
- Found 2077 reversible reactions that can only carry steady-state fluxes in a single direction due to dead-ends.
Starting duplicate test...
- Skipping redox duplicates because no redox_pairs and/or proton_ids were provided.
- Found 447 reactions that were some type of duplicate:
  - 0 were completely identical to at least one other reaction.
  - 13 involve the same metabolites but go in the opposite direction or have the opposite reversibility as at least one other reaction.
  - 447 involve the same metabolites but with different coefficients as at least one other reaction.
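
To illustrate the last category, the sketch below groups reactions by the set of metabolites they involve and flags groups whose coefficients differ. It is only an illustration of the idea and not macaw's actual implementation; the reaction and metabolite IDs are invented, and `same_metabolites_different_coefficients` is a hypothetical helper. Note that this simple version also flags opposite-direction pairs, since their coefficients differ in sign.

```python
from collections import defaultdict


def same_metabolites_different_coefficients(reactions):
    """Return sorted groups of reaction IDs that share a metabolite set
    but do not all have identical coefficients.

    `reactions` maps reaction ID -> {metabolite ID: coefficient}.
    """
    groups = defaultdict(list)
    for rxn_id, stoich in reactions.items():
        groups[frozenset(stoich)].append(rxn_id)

    flagged = []
    for rxn_ids in groups.values():
        if len(rxn_ids) < 2:
            continue
        distinct = {tuple(sorted(reactions[r].items())) for r in rxn_ids}
        if len(distinct) > 1:
            flagged.append(sorted(rxn_ids))
    return sorted(flagged)


# Invented toy reactions for illustration only.
toy_reactions = {
    "R1": {"A": -1, "B": 1},
    "R2": {"A": -1, "B": 2},  # same metabolites as R1, different coefficient
    "R3": {"A": -1, "C": 1},  # different metabolite set
    "R4": {"A": -1, "C": 1},  # identical to R3, so not flagged here
}

flagged_groups = same_metabolites_different_coefficients(toy_reactions)
print(flagged_groups)
```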

A more detailed output from this test run is also committed to data/macawResults/macaw_results.csv.

Note: In the case of multiple test runs, this post will be edited.