Closed Tuumke closed 4 months ago
The bigger issue is that GitHub Actions doesn't use cached layers, so it rebuilds 100% of the time, negating Docker's layer caching.
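One option, if the builds stay automated, is to turn on the GitHub Actions cache backend that `docker/build-push-action` supports. A sketch (tags and file paths here are placeholders, and the job needs a `docker/setup-buildx-action` step before this one):

```yaml
- name: Build and push
  uses: docker/build-push-action@v5
  with:
    push: true
    tags: mccloud/subgen:latest
    # Reuse layers across workflow runs via the GitHub Actions cache backend.
    # Requires docker/setup-buildx-action earlier in the job.
    cache-from: type=gha
    cache-to: type=gha,mode=max
```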
I can look at adding it back if I can simplify it. It was originally removed so I didn't have to 'maintain' more than one image. I can probably simplify it by moving a requirements.txt into the repo and installing from that.
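With a shared requirements.txt, the GPU and CPU Dockerfiles could differ only in their base image and pinned torch wheel. A hypothetical sketch (file names like `launcher.py` and the ffmpeg dependency are assumptions, not the repo's actual layout):

```dockerfile
# Sketch: both images install from one requirements.txt,
# so only the base image differs between GPU and CPU builds.
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip ffmpeg && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /subgen
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY launcher.py .
CMD ["python3", "launcher.py"]
```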
You could also have a separate branch for it? That's how I maintained a dev and a production container. Not sure if that would work in this case. Or maybe change the workflow to do another build with a separate Dockerfile:
```yaml
steps:
  - uses: actions/checkout@v2
  - name: Build the Docker image
    run: docker build ./api/Service/ --file Dockerfile --tag my-image-name:$(date +%s)
```
Not sure if that works, but I got it from here. -edit- Or, from the same post:
```yaml
steps:
  - name: Login to Docker Hub
    uses: docker/login-action@v2
    with:
      username: ${{ secrets.DOCKERHUB_USERNAME }}
      password: ${{ secrets.DOCKERHUB_PASSWORD }}
  - name: Build and push register app
    uses: docker/build-push-action@v3
    with:
      push: true
      file: ./register/Dockerfile
      tags: ${{ secrets.DOCKERHUB_USERNAME }}/register:latest
```
If you put it in a cpu folder, it would look like:
```yaml
- name: Build and push
  uses: docker/build-push-action@v5
  with:
    push: true
    file: ./cpu/Dockerfile
    tags: mccloud/subgen:cpu
```
Added the build back for a CPU version: mccloud/subgen:cpu
It looks like it's only about half the size. I had to force-remove the nvidia dependencies that several packages pull in, so let me know if it works. Open to any suggestions to make it smaller.
An alternative, which I just did: I set UPDATE to default to True in the launcher and stopped the automated builds, so the launcher just grabs the newest script on start.
For most Docker users, the most recent pull will keep you up to date with the latest subgen.py and won't force you to re-pull Docker images (unless there are significant whisper changes, which are outside the scope of subgen.py).
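The launcher behavior described above might look roughly like this (a sketch only; the real URL, filename, and environment handling in subgen may differ):

```python
import os
import urllib.request

# Assumed raw URL; the actual launcher may point elsewhere.
SCRIPT_URL = "https://raw.githubusercontent.com/McCloudS/subgen/main/subgen.py"

def should_update(path: str = "./subgen.py") -> bool:
    """Download when UPDATE is truthy (the new default) or the script is missing."""
    update = os.getenv("UPDATE", "True").strip().lower() in ("1", "true", "yes")
    return update or not os.path.exists(path)

def fetch_script(path: str = "./subgen.py") -> None:
    # Pull the newest script at container start instead of baking it into the image.
    if should_update(path):
        print(f"File exists, but UPDATE is set to True. Downloading {path} from GitHub...")
        urllib.request.urlretrieve(SCRIPT_URL, path)
        print(f"File downloaded successfully to {path}")
```

This way the image only needs rebuilding when the dependencies change, not on every script commit.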
Size is now 2.684 GB. I wonder why this is still so big? Is it because of the base image?
-edit- ubuntu:22.04 is also only 28.17 MB. Must be one of the Python packages then?
-edit2- Or are the models inside the image? If so, could we change it so it downloads them from somewhere on Docker start?
-edit3- faster-whisper downloads the models from Hugging Face, as we can read on the faster-whisper GitHub. It downloads them from here, where we can see the model is 3 GB.
I do have an error now:

```
root@thebox:/mnt/user/dockers# docker logs -f subgen
File exists, but UPDATE is set to True. Downloading ./subgen.py from GitHub...
File downloaded successfully to ./subgen.py
  File "/subgen/subgen.py", line 326
    else:
    ^^^^
SyntaxError: invalid syntax
```
This is after I tried to set UPDATE=True for once :-)
Somehow I missed a parenthesis. Should be good when you re-run.
Looking at the docker build, off the cuff: torch is 755 MB, triton is 168 MB. Other pip packages easily add another 200 MB or so. Not sure of the size of the base 22.04 image.
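To see which installed packages actually dominate the image, something like this run inside the container would break the size down (a rough sketch; it only walks site-packages, so apt-installed libraries aren't counted):

```python
import os
import sysconfig

def dir_size_mb(path: str) -> float:
    """Total size of all files under path, in megabytes."""
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # broken symlinks etc.
    return total / 1_000_000

if __name__ == "__main__":
    # Print the ten largest entries in site-packages.
    site = sysconfig.get_paths()["purelib"]
    sizes = {entry: dir_size_mb(os.path.join(site, entry))
             for entry in os.listdir(site)}
    for entry, mb in sorted(sizes.items(), key=lambda kv: -kv[1])[:10]:
        print(f"{mb:8.1f} MB  {entry}")
```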
I updated my post with where the size is probably coming from.
Same error on another line:

```
File exists, but UPDATE is set to True. Downloading ./subgen.py from GitHub...
File downloaded successfully to ./subgen.py
  File "/subgen/subgen.py", line 363
    else:
    ^^^^
SyntaxError: invalid syntax
```
The models are only downloaded when subgen tries to start the model, they aren't downloaded during the docker build at all.
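Since the models are pulled at runtime, mounting the download cache as a volume would keep them out of the image and avoid re-downloading ~3 GB every time the container is recreated. A docker-compose sketch (service name and the cache path are assumptions; faster-whisper uses the Hugging Face Hub cache, which defaults to `~/.cache/huggingface`):

```yaml
services:
  subgen:
    image: mccloud/subgen:latest
    volumes:
      # Persist downloaded Whisper models outside the image.
      - ./models:/root/.cache/huggingface
```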
Just fixed that one too, sorry for using you as the guinea pig. It takes about 10 minutes for GitHub to 'publish' the update for the launcher. Not sure how mine didn't error out last night.
-edit- Out for the next 2 hours or so, will try the new push once I get back.
I don't mind at all! At least we get the errors out of the way ;-). Really like this docker/app, mate! Thanks so far.
(edited: btw, if you want a separate issue for this, let me know?)
The regular subgen docker still works. The :cpu gives me:
```
File exists, but UPDATE is set to True. Downloading ./subgen.py from GitHub...
File downloaded successfully to ./subgen.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 176, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.12: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/subgen/subgen.py", line 20, in <module>
    import stable_whisper
  File "/usr/local/lib/python3.10/dist-packages/stable_whisper/__init__.py", line 1, in <module>
    from .whisper_word_level import *
  File "/usr/local/lib/python3.10/dist-packages/stable_whisper/whisper_word_level/__init__.py", line 2, in <module>
    from .cli import cli
  File "/usr/local/lib/python3.10/dist-packages/stable_whisper/whisper_word_level/cli.py", line 8, in <module>
    import torch
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 236, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 197, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 162, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['/subgen', '/usr/lib/python310.zip', '/usr/lib/python3.10', '/usr/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/dist-packages', '/usr/lib/python3/dist-packages']
```
Yup, that's because I removed the nvidia CUDA libraries. Guess torch still wants them even on CPU only.
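Instead of stripping the nvidia libraries after the fact, one option for the :cpu image is installing the CPU-only torch wheel, which doesn't link against libcudart/libcublas at all. A Dockerfile fragment (untested sketch; the requirements.txt name is an assumption):

```dockerfile
# Install CPU-only PyTorch from the official CPU wheel index first,
# so torch never tries to dlopen the CUDA libraries, then the rest.
RUN pip3 install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu && \
    pip3 install --no-cache-dir -r requirements.txt
```

The CPU wheels are also much smaller than the CUDA-enabled ones, which should help the image size too.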
Changed the image to debian-slim to see if we can save some space with the CUDA packages added back. It's a 5.95 GB image vs the 7.92 GB GPU image. The original reason I stopped 'maintaining' it was that the ~2 GB difference wasn't worth the effort.
Now that the launcher downloads script updates by default and I've disabled Docker image rebuilds, there isn't much reason to maintain multiple images.
Alright. Do you need the CUDA packages on CPU only, though?
Your previous errors show that we do.
Ah, like so. Thank you for looking into it anyway. I'll leave it up to you whether you want to keep the extra image or not. I mean, if it's only a 2 GB difference...
It would be sweet to have the option / tag / branch for a smaller image again.
People use docker-compose to pull in their stack. Each time I do, and subgen has a new Docker image, I have to pull in 6 GB. This makes me want to update my containers less often. It's not how Docker is supposed to work; Docker is meant for the smallest possible images. So it would be nice to have something without the CUDA images and associated packages for those running CPU only.
If you need any help with this, I've got some experience as well.