DD-DeCaF / simulations

Model service that adjusts the model according to incoming messages and returns information such as fluxes and theoretical maximum yields.
Apache License 2.0

Remove duplicate compounds to speed up metabolite lookup #126

Closed kvikshaug closed 5 years ago

kvikshaug commented 5 years ago

I realize now that this was a valid concern you raised. This helps a bit (because there are often many duplicate ions/salts applied); however, it still takes >1s to apply 77 metabolites.

kvikshaug commented 5 years ago

You'd have to include the namespace to guarantee uniqueness, and the data structure will be unordered and a bit redundant (e.g. {"chebi:CHEBI:12345": {"id": "CHEBI:12345", "namespace": "chebi"}}), but that will indeed be a tiny bit faster.

Without removing duplicates: 2.95s
Removing duplicates with json: 0.00104s + 1.55s
Removing duplicates with dict: 0.000028s + 1.48s

The ideal case would be to make a set of hashable chemical class instances to avoid dupes, but that seems a bit overkill here, so I'll go with the dict solution.
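For illustration, the dict approach described above could look roughly like the sketch below. It is not the code from this pull request: the function name `deduplicate`, the variable `measured`, and the input shape (a list of dicts with `id` and `namespace` keys) are assumptions based only on the example structure quoted in this thread.

```python
def deduplicate(compounds):
    """Drop repeated compounds, keyed by namespace plus identifier.

    `compounds` is assumed to be a list of dicts with "id" and
    "namespace" keys, mirroring the example structure in this thread.
    """
    unique = {}
    for compound in compounds:
        # Include the namespace in the key so that identical identifiers
        # from different namespaces are not merged.
        key = f"{compound['namespace']}:{compound['id']}"
        # setdefault keeps the first occurrence and ignores later repeats.
        unique.setdefault(key, compound)
    return list(unique.values())


# Hypothetical input containing one repeated compound.
measured = [
    {"id": "CHEBI:12345", "namespace": "chebi"},
    {"id": "CHEBI:12345", "namespace": "chebi"},
    {"id": "CHEBI:67890", "namespace": "chebi"},
]
deduplicated = deduplicate(measured)
```

On Python 3.7 and later, dicts keep insertion order, so the first occurrence of each compound is the one retained.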

Midnighter commented 5 years ago

> The ideal case would be to make a set of hashable chemical class instances to avoid dupes, but that seems a bit overkill here, so I'll go with the dict solution.

from collections import namedtuple

Compound = namedtuple("Compound", ["id", "namespace"])

{
    Compound(id="CHEBI:234324", namespace="chebi"): 1
}

maybe?

kvikshaug commented 5 years ago

Nice suggestion :+1: then we can just use a set too, no need for a dict.
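A minimal sketch of the set-based variant, reusing the `Compound` namedtuple suggested above. The helper name `unique_compounds` and the input records are hypothetical and only illustrate the idea; the code actually merged in this pull request may differ.

```python
from collections import namedtuple

# Namedtuples are hashable as long as their fields are, so instances
# can be stored directly in a set.
Compound = namedtuple("Compound", ["id", "namespace"])


def unique_compounds(records):
    """Return the set of distinct compounds in `records`.

    `records` is assumed to be an iterable of dicts with "id" and
    "namespace" keys.
    """
    return {Compound(id=r["id"], namespace=r["namespace"]) for r in records}


# Hypothetical input containing one repeated compound.
records = [
    {"id": "CHEBI:234324", "namespace": "chebi"},
    {"id": "CHEBI:234324", "namespace": "chebi"},
    {"id": "CHEBI:12345", "namespace": "chebi"},
]
unique = unique_compounds(records)
```

Two `Compound` instances compare equal when both fields match, so the set drops duplicates without a hand-built string key. Unlike the dict variant, a set does not preserve the order of the input.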

codecov-io commented 5 years ago

Codecov Report

Merging #126 into devel will increase coverage by 0.11%. The diff coverage is 100%.


@@            Coverage Diff            @@
##           devel     #126      +/-   ##
=========================================
+ Coverage     72%   72.11%   +0.11%     
=========================================
  Files         20       20              
  Lines        743      746       +3     
=========================================
+ Hits         535      538       +3     
  Misses       208      208

Impacted Files | Coverage Δ
--- | ---
src/model/modeling/adapter.py | 68.35% <100%> (+0.61%) :arrow_up:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update f163a72...dc6e9ba.