cdanielmachado / smetana

SMETANA: a tool to analyse interactions in microbial communities

Changes in result for -d and -g #20

Open ntromas opened 3 years ago

ntromas commented 3 years ago

Hi Daniel,

I am using SMETANA with a small community (models built with the latest version of CarveMe, without gap-filling). My output changes on every run with -d, which I think is linked to the use of CPLEX. What would you suggest to avoid or account for this variation (e.g. a different number of interactions)?

I also got different results with -g (using the same input). I tried --molweight just in case but got this error.

cat debug.tsv
community medium key1 key2 data
all complete mip ni 4abz,LalaDgluMdap,acmana,alaala,amp,argL,bz,ca2,cgly,cl,cobalt2,cu2,cytd,fe3,frmd,glnL,gly_asnL,glyb,glyc3p,homL,ileL,k,lysL,mg2,mn2,nmn,o2,pheL,pntoR,proL,quin,ribflv,salchs4fe,so4,thm,tol,tyrL,uaccg,udcpp,valL,zn2
all complete mip i 4abz,LalaDgluMdap,acmana,alaala,amp,argL,bz,ca2,cgly,cl,cobalt2,cu2,cytd,fe3,frmd,glnL,gly_asnL,glyc3p,ileL,k,lysL,mg2,mn2,nmn,o2,pheL,pntoR,proL,quin,ribflv,so4,tol,tyrL,uaccg,udcpp,valL,zn2
all complete mro community LalaDgluMdap,acmana,alaala,amp,argL,bz,ca2,cgly,chol,cl,cobalt2,cu2,cytd,fe3,fol,frmd,glcur,glnL,gly_asnL,glyc3p,ileL,indole,k,lysL,mg2,mn2,nmn,pheL,progly,ribflv,so4,tol,tyrL,uaccg,udcpp,valL,zn2
all complete mro M11_Microcystis LalaDgluMdap,acmana,alaala,amp,argL,bz,ca2,cl,cobalt2,cu2,cytd,fe2,fe3,fol,frmd,glnL,gly_asnL,homL,ileL,indole,k,lysL,mg2,mn2,nmn,o2,pheL,pntoR,progly,ribflv,so4,tyrL,uaccg,udcpp,val__L,zn2
all complete mro M11_Roseomonas argL,ca2,cgly,cl,cobalt2,cu2,cytd,fe3,glcur,glyb,glyc3p,k,lysL,mg2,mn2,nmn,phe__L,so4,thm,tol,zn2

smetana -g M11_Microcystis.xml M11_Roseomonas.xml --molweight
/home/nico/miniconda3/lib/python3.8/site-packages/smetana/smetana.py:351: UserWarning: MRO: Failed to find a valid solution for: M11_Roseomonas
  warn('MRO: Failed to find a valid solution for: ' + org_id)

Thanks for your help!

Nico

cdanielmachado commented 3 years ago

Hi Nico,

Your problems are related to these previously identified issues:

https://github.com/cdanielmachado/smetana/issues/17

https://github.com/cdanielmachado/smetana/issues/10

To be honest, there is not much that can be done without trying to make the formulation a bit more robust, which I won't have time to do in the near future.

One thing that might help (and I would always recommend) is to ignore inorganic compounds:

--exclude inorganic.txt

You can find the file here: https://www.dropbox.com/s/bapo01qf1uef3wm/inorganic.txt?dl=0
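
For reference, here is a minimal sketch of how the full command line could be assembled when scripting runs. It only combines flags that appear in this thread (-g, -d, --molweight, --exclude); the helper name smetana_command is hypothetical, and it assumes inorganic.txt has been downloaded into the working directory.

```python
import shlex


def smetana_command(models, mode="-g", molweight=True, exclude_file="inorganic.txt"):
    """Build the argument list for one SMETANA run (nothing is executed here).

    Hypothetical helper: it only strings together flags mentioned in this thread.
    """
    if mode not in ("-g", "-d"):
        raise ValueError("mode must be '-g' or '-d'")
    args = ["smetana", mode, *models]
    if molweight:
        args.append("--molweight")
    if exclude_file:
        # Assumes the exclusion list was saved locally under this name.
        args += ["--exclude", exclude_file]
    return args


cmd = smetana_command(["M11_Microcystis.xml", "M11_Roseomonas.xml"])
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to actually launch the run.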

ntromas commented 3 years ago

Hi Daniel,

Thanks for the help. Yes, I saw the other issues, but I was wondering whether there is a way to reduce the variation in the outputs. I got my answer!

Is it ok to always use --molweight (for -d and -g)?

When -d is used, my guess is that a minimal medium is computed for the whole community (if the input is, for example, 4 different taxa)? Sometimes when I use these 4 taxa I get an empty detailed.tsv, but when I remove one of them, detailed.tsv is no longer empty.

Thanks again for your time and explanation,

Cheers

cdanielmachado commented 3 years ago

Hi Nico,

Yes, I would recommend always using --molweight. From my experience, minimizing the total mass of the consumed substrates, rather than simply the total number of substrates, results in more realistic estimates of the growth medium.

Regarding the variation in the outputs, my suggestion is to run the simulations several times (10 or 100, depending on how long they take) and then compute the average of each score. You can also use the variability of a score as an indication of how reliable it is.
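
The averaging described above could be sketched roughly as follows. This is an illustration, not part of SMETANA: the column names (receiver, donor, compound, smetana) are assumptions about the header of the detailed output and should be checked against your own files. Counting an interaction that is missing from a run as a score of 0 is also a choice made here, not something prescribed in this thread.

```python
import csv
from collections import OrderedDict
from statistics import mean, pstdev

# Assumed column names; verify them against the header of your detailed.tsv.
KEY_COLUMNS = ("receiver", "donor", "compound")
SCORE_COLUMN = "smetana"


def read_scores(handle):
    """Return {(receiver, donor, compound): score} for one run's TSV output."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        tuple(row[k] for k in KEY_COLUMNS): float(row[SCORE_COLUMN])
        for row in reader
    }


def summarize_runs(runs):
    """Aggregate a list of per-run score dicts.

    An interaction absent from a run is counted as 0.0 for that run, and
    'frequency' reports the fraction of runs in which it appeared at all.
    """
    if not runs:
        return OrderedDict()
    keys = set().union(*runs)  # every interaction seen in any run
    summary = OrderedDict()
    for key in sorted(keys):
        values = [run.get(key, 0.0) for run in runs]
        summary[key] = {
            "mean": mean(values),
            "sd": pstdev(values),  # population standard deviation across runs
            "frequency": sum(key in run for run in runs) / len(runs),
        }
    return summary
```

To use it, open each run's detailed output (for example run_1_detailed.tsv, run_2_detailed.tsv, which are hypothetical file names), pass each handle to read_scores, and give the resulting list to summarize_runs. Interactions with a low frequency or a large standard deviation are the ones to treat with the most caution.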

ntromas commented 3 years ago

Hi Daniel,

Thanks a lot for the advice!

Cheers,

Nico
