I've played around with a bunch of profiling tools. pyinstrument and py-spy have been my favorites. memray was cool for memory profiling even if it didn't yield any obvious areas for improvement in our code.
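For anyone else who wants to try this: pyinstrument is the nicest to read, but the stdlib `cProfile`/`pstats` pair gives a similar call-level view with zero installs. A minimal sketch (the `hot_loop` function is just a made-up stand-in for an expensive sim step):

```python
import cProfile
import io
import pstats

def hot_loop(n):
    # Stand-in for an expensive simulation step.
    total = 0.0
    for i in range(n):
        total += i ** 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
hot_loop(100_000)
profiler.disable()

# Print the top entries sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

pyinstrument's `Profiler` class works much the same way (`start()`/`stop()`, then print a report), just with a sampling profiler and a tree-shaped output.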
Unfortunately, I don't have access to Sherlock. It would be super helpful if you could update the environment there for me.
I'll rerun the sims, compare with the script you mentioned, and report back.
I'll update the staging pyenv on Sherlock Saturday/Sunday then tweak setup-environment.sh.
Then let's coordinate after the PR build. @ggsun do you want to review any of the changes? When everyone's ready I can update the main pyenv, revert the setup-environment.sh tweak, and squash/merge to minimize the delay between steps.
I'm at a conference in Hawaii, happy to review this after I get back on Wednesday. Amazing time savings here!
retest this please
After lunch I'll switch the pyenv from staging to wcEcoli, update Sherlock's pyenv wcEcoli, and merge this.
On Sherlock, wcEcoli3-staging has numpy & scipy linked to OpenBLAS v0.3.26. See $GROUP_HOME/installation_notes/openblas.txt.
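For anyone double-checking which BLAS their own numpy actually linked against, `np.show_config()` reports the build/link info (it prints to stdout on reasonably recent numpy versions; capturing it here just to make it greppable):

```python
import contextlib
import io

import numpy as np

# show_config() prints numpy's build configuration, including which
# BLAS/LAPACK it was linked against (e.g. OpenBLAS and its version).
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    np.show_config()
info = buf.getvalue()
print(info)
```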
Thanks for the suggestion @ggsun! I'm still trying to figure out why my changes have led to some differences in the sim output. It's kinda shocking to me that the calculated FBA fluxes can be so different without also throwing off the mass, doubling time, etc. I'll update again once I've finished my investigation.
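One hedged guess, for what it's worth: FBA-style LPs often have degenerate optima, so a change in solver numerics (or the linked BLAS) can return a very different flux vector that still satisfies the same constraints and objective. A toy numpy illustration with made-up stoichiometry, not our model:

```python
import numpy as np

# Toy stoichiometry: a single balance constraint v1 + v2 - v3 = 0.
S = np.array([[1.0, 1.0, -1.0]])

# Two very different flux vectors...
v_a = np.array([2.0, 0.0, 2.0])
v_b = np.array([0.0, 2.0, 2.0])

# ...both satisfy mass balance exactly and both achieve the same "output"
# flux v3 = 2. Aggregate quantities (mass, doubling time) would agree even
# though the individual fluxes differ wildly.
assert np.allclose(S @ v_a, 0.0) and np.allclose(S @ v_b, 0.0)
assert v_a[2] == v_b[2]
```

If that's what's happening, comparing fluxes elementwise between runs would show big diffs while the objective value stays put.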
Don't the Jenkins build outputs go into /scratch/groups/mcovert/jenkins/workspace*/? evaluationTime is one of the CORE analysis scripts, so its output should be there for both a PR build and a daily build, right?
I think the output is here: /scratch/groups/mcovert/wc_ecoli/daily_build/20240115.002046__Daily_build./wildtype_000000/000000/generation_000000/000000/plotOut
Got a little frustrated with the whole Accelerate rabbithole not yielding any performance gains, so I decided to try profiling our model in hopes of finding other avenues for optimization. Lo and behold, with just 2 small changes (here and here), I was able to cut the runtime for one generation by something like 20%. Adding in the JIT stuff brings the total improvement to about 25% on my machine (from 8m45s to 6m30s).
The little tweak to `nf_glpk.py` does not improve memory utilization, since the unnecessary dense array gets garbage collected anyway. I just changed it because it stuck out like a sore thumb when profiling.
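I haven't seen the exact diff, but for the archives, the general anti-pattern (materializing a dense array just to pull a handful of values back out of it) looks something like this; the names and shapes are made up:

```python
import numpy as np

n_rows, n_cols = 2000, 1500
rows = np.array([3, 10, 42])
cols = np.array([7, 7, 9])
vals = np.array([1.0, -2.0, 0.5])

def with_dense_intermediate():
    # Allocates a ~24 MB temporary just to read three entries back out.
    dense = np.zeros((n_rows, n_cols))
    dense[rows, cols] = vals
    return dense[rows, cols].sum()

def without_dense_intermediate():
    # Work on the coordinate data directly; no big temporary allocation.
    return vals.sum()

assert with_dense_intermediate() == without_dense_intermediate()
```

As noted above, the temporary gets garbage collected either way, so peak memory barely changes; the win (when there is one) is in skipping the allocation and fill inside a hot loop.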