CovertLab / wcEcoli

Whole Cell Model of E. coli

Optimizations #1425

Closed thalassemia closed 5 months ago

thalassemia commented 6 months ago

Got a little frustrated with the whole Accelerate rabbit hole not yielding any performance gains, so I decided to try profiling our model in hopes of finding other avenues for optimization. Lo and behold, with just 2 small changes (here and here), I was able to cut the runtime for one generation by roughly 20%. Adding in the JIT stuff brings the total improvement to about 25% on my machine (from 8m45s to 6m30s).

The little tweak to nf_glpk.py does not improve memory utilization because the unnecessary dense array is garbage collected. I just changed it because it stuck out like a sore thumb when profiling.
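The point about garbage collection can be illustrated with a minimal sketch. This is not the actual nf_glpk.py code: the function names, the `entries` dict, and the sizes are all hypothetical, and the standard-library `tracemalloc` module stands in for a real memory profiler. It shows that a temporary dense array raises peak memory during the call but is released afterwards, so memory held after the call is about the same either way.

```python
import tracemalloc


def nonzero_via_dense(entries, size):
    """Hypothetical pattern: build a temporary dense list, then extract nonzeros."""
    dense = [0.0] * size  # temporary; released once the function returns
    for index, value in entries.items():
        dense[index] = value
    return [(i, v) for i, v in enumerate(dense) if v != 0.0]


def nonzero_direct(entries, size):
    """Hypothetical alternative: skip the dense intermediate entirely."""
    return sorted((i, v) for i, v in entries.items() if v != 0.0)


def measure(func, *args):
    """Return (result, bytes still held after the call, peak bytes during it)."""
    tracemalloc.start()
    result = func(*args)
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current, peak


# Made-up toy input: two nonzero coefficients in a vector of 200,000 slots.
entries = {3: 1.5, 70_000: -2.0}
size = 200_000

dense_result, dense_current, dense_peak = measure(nonzero_via_dense, entries, size)
direct_result, direct_current, direct_peak = measure(nonzero_direct, entries, size)

print("dense :", dense_current, "bytes held,", dense_peak, "bytes peak")
print("direct:", direct_current, "bytes held,", direct_peak, "bytes peak")
```

Both functions return the same entries; only the peak differs, which is why a change like this shows up in a profile without changing steady-state memory use.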

thalassemia commented 6 months ago

I've played around with a bunch of profiling tools. pyinstrument and py-spy have been my favorites. memray was cool for memory profiling even if it didn't yield any obvious areas for improvement in our code.
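For readers who want to try the same kind of profiling without installing anything, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules as a stand-in for pyinstrument or py-spy. The functions `slow_step` and `run_generation` are hypothetical placeholders, not part of wcEcoli.

```python
import cProfile
import io
import pstats


def slow_step(n):
    """Hypothetical stand-in for an expensive simulation step."""
    return sum(i * i for i in range(n))


def run_generation():
    """Hypothetical stand-in for simulating one generation."""
    return [slow_step(20_000) for _ in range(20)]


def profile_report(func, top=5):
    """Profile func() and return a text report of the top cumulative entries."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(run_generation)
print(report)
```

Sorting by cumulative time tends to surface the call paths worth a closer look; a sampling profiler such as py-spy gives a similar picture with lower overhead on long runs.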

Unfortunately, I don't have access to Sherlock. It would be super helpful if you could update the environment there for me.

I'll rerun the sims, compare with the script you mentioned, and report back.

1fish2 commented 6 months ago

I'll update the staging pyenv on Sherlock Saturday/Sunday then tweak setup-environment.sh.

Then let's coordinate after the PR build. @ggsun do you want to review any of the changes? When everyone's ready I can update the main pyenv, revert the setup-environment.sh tweak, and squash/merge to minimize the delay between steps.

ggsun commented 6 months ago

I'm at a conference in Hawaii, happy to review this after I come back on Wednesday. Amazing time savings here!

1fish2 commented 5 months ago

retest this please

1fish2 commented 5 months ago

After lunch I'll switch the pyenv from staging to wcEcoli, update Sherlock's pyenv wcEcoli, and merge this.

On Sherlock, wcEcoli3-staging has numpy & scipy linked to OpenBLAS v0.3.26. See $GROUP_HOME/installation_notes/openblas.txt.
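One way to double-check which BLAS a given environment's numpy is linked against is to capture the output of `numpy.show_config()`. This is only a sketch: the helper name `describe_numpy_build` is made up here, and the exact text printed varies between numpy versions, so the substring check is a rough heuristic rather than a guarantee.

```python
import contextlib
import io


def describe_numpy_build():
    """Return numpy's build/link configuration as text, or None if numpy is absent."""
    try:
        import numpy as np
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints to stdout, so capture it instead of parsing the terminal.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


config_text = describe_numpy_build()
if config_text is None:
    print("numpy is not installed in this environment")
else:
    print(config_text)
    # Heuristic only: the report format differs across numpy versions.
    print("mentions OpenBLAS:", "openblas" in config_text.lower())
```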

thalassemia commented 5 months ago

Thanks for the suggestion @ggsun! I'm still trying to figure out why my changes have led to some differences in the sim output. It's kinda shocking to me that the calculated FBA fluxes can be so different without also throwing off the mass, doubling time, etc. I'll update again once I've finished my investigation.
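When tracking down output differences like these, a tolerance-based comparison that ranks the largest mismatches can help narrow things down. The sketch below is a generic illustration, not the comparison script mentioned in the thread: the reaction names and flux values are made-up toy data, and the tolerances are arbitrary assumptions.

```python
import math


def largest_differences(baseline, candidate, rel_tol=1e-6, abs_tol=1e-9, top=3):
    """Return up to `top` (abs_diff, name, baseline, candidate) tuples, largest first."""
    mismatches = []
    for name in sorted(baseline.keys() & candidate.keys()):
        a, b = baseline[name], candidate[name]
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((abs(a - b), name, a, b))
    mismatches.sort(reverse=True)
    return mismatches[:top]


# Made-up toy fluxes keyed by hypothetical reaction names.
baseline_fluxes = {"rxn_A": 1.0, "rxn_B": 2.5, "rxn_C": 0.0}
candidate_fluxes = {"rxn_A": 1.0000000001, "rxn_B": 0.5, "rxn_C": 1.5}

differences = largest_differences(baseline_fluxes, candidate_fluxes)
for diff, name, before, after in differences:
    print(f"{name}: {before} -> {after} (|diff| = {diff})")
```

Differences within floating-point noise (like `rxn_A` here) are filtered out, so only the fluxes that actually changed are listed.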

1fish2 commented 5 months ago

Don't the Jenkins build outputs go into /scratch/groups/mcovert/jenkins/workspace*/? evaluationTime is one of the CORE analysis scripts so its output should be there for a PR build and a daily build, right?

rjuenemann commented 5 months ago

> Don't the Jenkins build outputs go into /scratch/groups/mcovert/jenkins/workspace*/? evaluationTime is one of the CORE analysis scripts so its output should be there for a PR build and a daily build, right?

I think the output is here: /scratch/groups/mcovert/wc_ecoli/daily_build/20240115.002046__Daily_build./wildtype_000000/000000/generation_000000/000000/plotOut