PNNL-CompBio / coderdata

Automation scripts and benchmark dataset package for cancer drug prediction deep learning models.

Parallelization of build_all.py #175

Closed jjacobson95 closed 5 months ago

jjacobson95 commented 6 months ago

Everything is working. Ready to Merge.

In summary, the following commands run the build on a high-memory platform, fully validate it, and upload the results to Figshare and PyPI:

export SYNAPSE_AUTH_TOKEN="..."
export PYPI_TOKEN="..."
export FIGSHARE_TOKEN="..."

python build/build_all.py --all --high_mem --pypi --figshare --version 0.1.29
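Because the build takes many hours, it can help to confirm the three tokens are exported before starting. The following is a minimal sketch of such a pre-flight check, not part of build_all.py: the helper names `REQUIRED_TOKENS` and `missing_tokens` are hypothetical, and it assumes all three tokens are needed for a full `--all --pypi --figshare` run, as the commands above suggest.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_TOKENS = ("SYNAPSE_AUTH_TOKEN", "PYPI_TOKEN", "FIGSHARE_TOKEN")


def missing_tokens(env=None):
    """Return the required token names that are unset or empty.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_TOKENS if not env.get(name)]
```

A caller could run `missing_tokens()` first and stop with an error message if the returned list is not empty, so a missing token is caught before the build starts rather than at the upload step.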

Edit: I also manually updated all the files needed for GitHub Pages, which is now live.

jjacobson95 commented 6 months ago

@sgosline This should be ready to go. The full run with high_mem (and without) was successful. The latest HCMI changes haven't been tested yet, but that shouldn't affect this PR specifically.

This took ~16 hours from start to finish on a c5.9xlarge EC2 instance (36 vCPUs, 72 GB memory).

jjacobson95 commented 6 months ago

One note: the build/docker/Dockerfile.upload file will have to be updated to remove this line after testing is complete:

RUN git checkout docker-build-multi