Open RoryMMMM opened 1 month ago
After having a look at issue #68, I tried a CPU-only version of Torch. This moved the large layer into the "pip install -r ./requirements.txt" step, which implies that Torch was previously being installed in the
RUN env CC=mpicxx CXX=mpicxx pip install repast4py
step instead of the requirements.txt step.
So it looks like the problem was Torch all along. I'm not planning on using GPU support for now, and ~1.5 GB is much better:
$ docker history 9a1cb25cd860
IMAGE          CREATED             CREATED BY                                      SIZE      COMMENT
9a1cb25cd860   2 minutes ago       /bin/sh -c #(nop) ENV PYTHONPATH=/repast4py…    0B
2e321edc7386   2 minutes ago       /bin/sh -c env CC=mpicxx CXX=mpicxx pip inst…   28.4MB
64236b6f3e79   3 minutes ago       /bin/sh -c pip install -r ./requirements.txt    1.42GB
00430ad1e55c   8 minutes ago       /bin/sh -c #(nop) COPY file:e3a9b2a1f65788bd…   418B
a45824896792   About an hour ago   /bin/sh -c apt-get update && apt-get ins…       340MB
73b513f59526   2 years ago         /bin/sh -c #(nop) CMD ["python3"]               0B
<missing>      2 years ago         /bin/sh -c set -ex; savedAptMark="$(apt-ma…     9.51MB
<missing>      2 years ago         /bin/sh -c #(nop) ENV PYTHON_GET_PIP_SHA256…    0B
<missing>      2 years ago         /bin/sh -c #(nop) ENV PYTHON_GET_PIP_URL=ht…    0B
<missing>      2 years ago         /bin/sh -c #(nop) ENV PYTHON_SETUPTOOLS_VER…    0B
<missing>      2 years ago         /bin/sh -c #(nop) ENV PYTHON_PIP_VERSION=21…    0B
<missing>      2 years ago         /bin/sh -c cd /usr/local/bin && ln -s idle3…    32B
<missing>      2 years ago         /bin/sh -c set -ex && savedAptMark="$(apt-…     29.5MB
<missing>      2 years ago         /bin/sh -c #(nop) ENV PYTHON_VERSION=3.10.0     0B
<missing>      2 years ago         /bin/sh -c #(nop) ENV GPG_KEY=A035C8C19219B…    0B
<missing>      2 years ago         /bin/sh -c set -eux; apt-get update; apt-g…     3.11MB
<missing>      2 years ago         /bin/sh -c #(nop) ENV LANG=C.UTF-8              0B
<missing>      2 years ago         /bin/sh -c #(nop) ENV PATH=/usr/local/bin:/…    0B
<missing>      2 years ago         /bin/sh -c #(nop) CMD ["bash"]                  0B
<missing>      2 years ago         /bin/sh -c #(nop) ADD file:ece5ff85ca549f0b1…   80.4MB
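For anyone who wants to rank layers without reading the history table by eye, here is a minimal Python sketch. It assumes the SIZE column uses decimal units (B, kB, MB, GB), which is how the values above appear; the helper names parse_size and largest_layers are made up for this example, and the rows are copied by hand from the output above rather than read from Docker.

```python
import re

# Decimal units as they appear in the SIZE column (assumption: 1 GB = 10**9 bytes).
_UNITS = {"B": 1, "kB": 10**3, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)(B|kB|KB|MB|GB|TB)$")


def parse_size(text):
    """Convert a human-readable size such as '1.42GB' into bytes."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]


def largest_layers(rows, top=3):
    """Return the `top` (size_in_bytes, command) pairs, largest first."""
    sized = [(parse_size(size), command) for command, size in rows]
    return sorted(sized, key=lambda pair: pair[0], reverse=True)[:top]


# (CREATED BY, SIZE) pairs copied from the history output above.
rows = [
    ("env CC=mpicxx CXX=mpicxx pip inst…", "28.4MB"),
    ("pip install -r ./requirements.txt", "1.42GB"),
    ("apt-get update && apt-get ins…", "340MB"),
    ("ADD file:ece5ff85ca549f0b1…", "80.4MB"),
]

for size, command in largest_layers(rows):
    print(f"{size / 1e9:6.2f} GB  {command}")
```

With these rows the requirements.txt layer sorts first, which matches the reading of the table above.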
requirements.txt:
# Using python:3.10.0-slim
# Use CPU Torch for a smaller install
--extra-index-url https://download.pytorch.org/whl/cpu
torch # https://github.com/Repast/repast4py/issues/68
mpi4py==4.0.0 #https://pypi.org/project/mpi4py/
numpy==1.26.4 #2.1.1
pandas==2.2.3
numba==0.60.0
coverage==7.6.1
networkx==3.3
pyyaml==6.0.2
Cython==3.0.11
llvmlite==0.43.0
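To confirm that the CPU-only wheel is what actually got installed inside the container, something like the following Python sketch could be run in the image. The function name describe_torch_build is hypothetical; torch.__version__, torch.version.cuda and torch.cuda.is_available() are standard PyTorch attributes. On Linux, wheels from the CPU index normally carry a "+cpu" suffix in the version string, but treat that as a hint rather than a guarantee.

```python
import importlib.util


def describe_torch_build():
    """Return basic build facts for the installed torch, or None if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script also runs where torch is missing

    return {
        "version": torch.__version__,        # CPU wheels usually look like "x.y.z+cpu"
        "cuda_build": torch.version.cuda,    # None when torch was built without CUDA
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_build())
```

A CPU-only install should report cuda_build as None and cuda_available as False.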
I did a quick search in the git repo to see where Torch was being used:
This might be a silly question, but is Torch integral to Repast4Py?
Thanks for sharing your experience. I created #68 when I ran into the same issue while creating Docker images. I wasn't able to get the Docker image any smaller than roughly what you are showing. I'll look into further ways to reduce the size of repast4py installs based on individual use-case requirements.
Thanks for looking into this! Let me know if I can help in any way.
I'm building up an environment to use in an HPC setting. The goal is to build a Docker container with several Python packages and then convert it to a Singularity file to be used on the university's Slurm HPC cluster.
The one thing I've noticed, which isn't great, is that the install size of Repast4Py is huge. The Docker image is ~8.2 GB in size. After taking a look at the Docker image layers:
And the dockerfile (similar to the file in the repast git repo):
It's clear that the 8 GB layer is coming from the
RUN env CC=mpicxx CXX=mpicxx pip install repast4py
command. Using an image this large isn't impossible, but it causes problems with storing it in a free repo with limited space, building it with CI/CD, moving it to nodes in the cluster, etc.
Is there a simple way to reduce the build size? I can't really believe that it's using 8 gigs of compiled C code to run repast4py.