AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Revision: Big Run on AWS #1680

Closed sjspielman closed 1 year ago

sjspielman commented 1 year ago

We want to do a Big Run on AWS to obtain a time/cost estimate:

RUN_LOCAL=0 bash scripts/run-manuscript-analyses.sh
RUN_LOCAL=0 bash figures/generate-figures.sh

This requires that the scripts above be all set for resubmission.

sjspielman commented 1 year ago

Attached to this comment is the stdout from running RUN_LOCAL=0 bash scripts/run-manuscript-analyses.sh, at the bottom of which is the output from time, run with 64 GB RAM on 4 cores:

real 25687.98
user 3.60
sys 7.14

That's 25687.98/60 = 428.133 minutes.

For posterity, here's what I actually ran on the AWS instance -

nohup time -p docker run --rm --memory=64g --cpus=4 --volume="/home/ubuntu/OpenPBTA-analysis:/rocker-build:" ccdlopenpbta/open-pbta:latest bash run-manuscript-analyses-wrapped.sh &

Where the contents of run-manuscript-analyses-wrapped.sh are simply: RUN_LOCAL=0 bash scripts/run-manuscript-analyses.sh. This script was really only needed for me to get that RUN_LOCAL=0 variable in there before bash without docker complaining. The final nohup file was renamed to stdout_run-manuscript-analyses.txt and uploaded to this comment.
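For anyone repeating this, here is a minimal sketch of how a one-line wrapper like the one described above could be generated. The helper name write_wrapper and the temporary directory are assumptions for illustration only; the original wrapper was presumably written by hand.

```shell
#!/bin/bash
# Sketch: generate a one-line wrapper so that RUN_LOCAL=0 is set inside the
# container instead of on the `docker run` command line.
set -euo pipefail

# Hypothetical helper (not part of the repository): writes a wrapper at
# $1 that runs the script at $2 with RUN_LOCAL=0.
write_wrapper() {
  local wrapper_path="$1"
  local target_script="$2"
  printf 'RUN_LOCAL=0 bash %s\n' "$target_script" > "$wrapper_path"
}

# A temporary directory is used here only so the sketch does not touch a real checkout.
workdir="$(mktemp -d)"
write_wrapper "$workdir/run-manuscript-analyses-wrapped.sh" scripts/run-manuscript-analyses.sh
cat "$workdir/run-manuscript-analyses-wrapped.sh"
```

Passing the variable with docker's --env option (for example --env RUN_LOCAL=0) might avoid the wrapper entirely, but that is not what was run here and it has not been tried in this thread.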

stdout_run-manuscript-analyses.txt

sjspielman commented 1 year ago

Attached to this comment is the stdout from running RUN_LOCAL=0 bash figures/generate-figures.sh, at the bottom of which is the output from time, run with 64 GB RAM on 4 cores:

real 6273.08
user 0.25
sys 0.44

That's 6273.08/60 = 104.5513 minutes. I want to note that the vast, and I mean vast, majority of the runtime (not formally profiled, but my guess is about 90%) is spent running and exporting the snv-callers PBTA (not TCGA) figures: https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/ba5b09344b0f9c78d196d0d8c97a891b999bf0ea/figures/generate-figures.sh#L137-L140
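Since the 90% figure is a guess, a rough per-step timing could confirm it. Below is a minimal sketch using bash's built-in SECONDS counter. The helper time_step and the step label are hypothetical, and `true` stands in for a real step; nothing like this exists in generate-figures.sh as far as this thread shows.

```shell
#!/bin/bash
# Sketch: report wall-clock seconds for an individual step of a longer script.
set -euo pipefail

# Hypothetical helper: runs the command given after the label and prints
# "<label>: <elapsed> s" using bash's built-in SECONDS counter
# (whole-second resolution).
time_step() {
  local label="$1"
  shift
  local start=$SECONDS
  "$@"
  local elapsed=$(( SECONDS - start ))
  echo "${label}: ${elapsed} s"
}

# `true` is a placeholder; a real run would pass the actual step command here.
time_step "placeholder step" true
```

Wrapping each major step of the figure script this way would give a per-step breakdown in the same stdout file that already captures the overall time output.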

For posterity, here's what I actually ran on the AWS instance -

nohup time -p docker run --rm --memory=64g --cpus=4 --volume="/home/ubuntu/OpenPBTA-analysis:/rocker-build:" ccdlopenpbta/open-pbta:latest bash run-generate-figures.sh &

Where the contents of run-generate-figures.sh are simply: RUN_LOCAL=0 bash figures/generate-figures.sh. This script was really only needed for me to get that RUN_LOCAL=0 variable in there before bash without docker complaining. The final nohup file was renamed to stdout_generate-figures.txt and uploaded to this comment.

stdout_generate-figures.txt

sjspielman commented 1 year ago

Punchlines for cost estimation:


64 GB RAM, 4 cores

Analyses: 25687.98 s
Figures: 6273.08 s
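To turn these timings into a cost figure, the arithmetic can be sketched as below. The hourly rate is a HYPOTHETICAL placeholder, not an actual AWS price, and the helper names to_hours and estimate_cost are made up for this sketch; substitute the real rate for the instance type that was used.

```shell
#!/bin/bash
# Sketch: combine the two measured wall-clock times and estimate a cost.
set -euo pipefail
export LC_ALL=C  # keep awk's decimal separator as "."

analyses_s=25687.98   # "real" time for run-manuscript-analyses.sh (from this thread)
figures_s=6273.08     # "real" time for generate-figures.sh (from this thread)
hourly_rate_usd=1.00  # HYPOTHETICAL placeholder, not a real AWS price

# Convert seconds to hours, rounded to two decimals.
to_hours() {
  awk -v s="$1" 'BEGIN { printf "%.2f", s / 3600 }'
}

# Cost = hours * hourly rate, rounded to two decimals.
estimate_cost() {
  awk -v s="$1" -v r="$2" 'BEGIN { printf "%.2f", s / 3600 * r }'
}

total_s="$(awk -v a="$analyses_s" -v f="$figures_s" 'BEGIN { printf "%.2f", a + f }')"

echo "Analyses: $(to_hours "$analyses_s") h"
echo "Figures:  $(to_hours "$figures_s") h"
echo "Total:    ${total_s} s ($(to_hours "$total_s") h)"
echo "Estimated cost at ${hourly_rate_usd} USD/h: $(estimate_cost "$total_s" "$hourly_rate_usd") USD"
```

This only covers compute time for the two runs; storage, data transfer, and any instance idle time are not included.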

sjspielman commented 1 year ago

Woops, this needs to stay open as there are code changes from the Big Run that need to be PR'd!