Closed klae01 closed 2 years ago
Ack. I've repro'ed and fixed this issue in a staging environment; I just need to confirm with the rest of the team before I promote.
I've uploaded the correct binaries to my own account as a staging environment, ran 3 manual tests for each of the CPU and GPU environments, and listed the results here: https://gist.github.com/msaroufim/8ed161ae98bd34c70cc94e12afe851ca
The problem is that when we build our Docker images using the regular Dockerfile, there is this line (https://github.com/pytorch/serve/blob/release_0.5.3/docker/Dockerfile#L73): `RUN python -m pip install -U setuptools && python -m pip install --no-cache-dir captum torchtext torchserve torch-model-archiver`. It installs the latest release on PyPI and essentially ignores the `-b, --branch_name=BRANCH_NAME` ("specify a branch_name to use") parameter in our `build_image.sh` script, since that parameter is also never actually passed in for what we call our production images.
The reason this never happened before is that the latest release on PyPI always matched the expected version, because we never updated an old image. But when I made the hotfix update to our Docker images a month ago, a newer release was already available, so the build pulled the latest version, 0.6.0, instead of the version it needs to match, 0.5.3. As a result, `pip list` ends up disagreeing with what's in `serve/ts/version.txt` inside the Docker container. That said, if we had ever tried to promote Docker images before promoting to PyPI, we would have promoted bad binaries; it's pretty much a coincidence that this hasn't happened so far.
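The mismatch described above could be caught with a small check run inside the container. The following is only a sketch: the helper names `versions_match` and `check_installed_version` are hypothetical, and it assumes that `version.txt` holds exactly the version string that the installed package reports.

```python
from importlib import metadata
from pathlib import Path


def versions_match(installed: str, expected: str) -> bool:
    """Return True when both version strings are identical after trimming whitespace."""
    return installed.strip() == expected.strip()


def check_installed_version(package: str, version_file: str) -> bool:
    """Compare the installed version of `package` with the contents of `version_file`.

    Raises importlib.metadata.PackageNotFoundError if the package is not installed.
    """
    installed = metadata.version(package)
    expected = Path(version_file).read_text()
    return versions_match(installed, expected)
```

Inside the image this might be called as `check_installed_version("torchserve", "serve/ts/version.txt")`, with the path adjusted to wherever the repository is checked out. An exact string comparison is deliberately strict; a build that appends a suffix to the version would need a looser rule.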
Why not copy the files to the Docker `/tmp` folder and then run `python -m pip install .` to replace `python -m pip install --no-cache-dir torchserve`? In particular, `BRANCH_NAME` is not passed to the Dockerfile, so it looks fine to build directly with the code in the repository.
https://github.com/klae01/serve-dockerfile-update/tree/feature/pip_install_from_repo
`pip list` result obtained by changing only 6 lines in release_0.5.3:
Package              Version
-------------------- --------------
captum               0.5.0
certifi              2022.6.15
charset-normalizer   2.1.0
cycler               0.11.0
enum-compat          0.0.3
fonttools            4.34.4
future               0.18.2
idna                 3.3
kiwisolver           1.4.4
matplotlib           3.5.2
numpy                1.23.1
packaging            21.3
Pillow               9.2.0
pip                  22.1.2
pkg_resources        0.0.0
psutil               5.9.1
pyparsing            3.0.9
python-dateutil      2.8.2
requests             2.28.1
setuptools           63.2.0
six                  1.16.0
torch                1.12.0+cpu
torch-model-archiver 0.6.0
torchserve           0.5.3b20220718
torchtext            0.13.0
torchvision          0.13.0+cpu
tqdm                 4.64.0
typing_extensions    4.3.0
urllib3              1.26.10
wheel                0.37.1
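A listing like this can also be checked automatically rather than by eye. The sketch below parses the default two-column `pip list` output and reports packages whose version does not start with an expected prefix. The function names `parse_pip_list` and `find_mismatches` are hypothetical, and the prefix comparison is a crude assumption (for example, `0.5.3` would also accept `0.5.30`).

```python
from typing import Dict, Iterable, Optional


def parse_pip_list(text: str) -> Dict[str, str]:
    """Parse the column output of `pip list` into a {package: version} mapping."""
    versions: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed separator.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0]] = parts[1]
    return versions


def find_mismatches(
    versions: Dict[str, str], expected_prefix: str, packages: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Return the packages whose version is missing or lacks the expected prefix."""
    return {
        name: versions.get(name)
        for name in packages
        if not (versions.get(name) or "").startswith(expected_prefix)
    }
```

Applied to the listing above with prefix `0.5.3` and the packages `torchserve` and `torch-model-archiver`, this would report only `torch-model-archiver`, which is listed at 0.6.0.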
Updated the 0.5.3 images just now
`pip list` in `pytorch/torchserve:0.5.3-gpu` returns the following. I found that the `torchserve` version is `0.6.0`. `pytorch/torchserve:0.5.3-cpu` has the same issue.
pytorch/torchserve:0.5.3-gpu
pytorch/torchserve:0.5.3-cpu