replicate / cog

Containers for machine learning
https://cog.run
Apache License 2.0

Timeout for older cog docker execution locally using cog predict #1857

Open hackkhai opened 3 months ago

hackkhai commented 3 months ago

Older Docker images get stuck after the pip installation step and then time out, especially the one below:

cog predict r8.im/google-research/maxim@sha256:494ca4d578293b4b93945115601b6a38190519da18467556ca223d219c3af9f9 -i 'image="https://replicate.delivery/mgxm/6707a57f-4957-4047-b020-2160aed1d27a/1fromGOPR0950.png"' -i 'model="Image Deblurring (GoPro)"'

Error:

Building wheels for collected packages: maxim
Building wheel for maxim (setup.py): started
Building wheel for maxim (setup.py): finished with status 'done'
Created wheel for maxim: filename=maxim-1.0.0-py3-none-any.whl size=23363 sha256=b67fbea49fdbf4448cee795c2b834fcc2ecbb68ff611733ba4e92ec70d309feb
Stored in directory: /tmp/pip-ephem-wheel-cache-wzld9thq/wheels/65/55/ba/61cf444ccf0177534fa9c23b01c4cdaa5c3b89561a804c66f1
Successfully built maxim
Installing collected packages: ml-collections, maxim
Attempting uninstall: ml-collections
Found existing installation: ml-collections 0.1.1
Uninstalling ml-collections-0.1.1:
Successfully uninstalled ml-collections-0.1.1
Successfully installed maxim-1.0.0 ml-collections-0.1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 22.0.4; however, version 24.2 is available.
You should consider upgrading via the '/root/.pyenv/versions/3.8.13/bin/python3.8 -m pip install --upgrade pip' command.
ⅹ Timed out

When I tried newer images, they worked completely fine, like the one below:

cog predict r8.im/meta/sam-2@sha256:fe97b453a6455861e3bac769b441ca1f1086110da7466dbb65cf1eecfd60dc83 -i 'image="https://replicate.delivery/pbxt/LMbGi83qiV3QXR9fqDIzTl0P23ZWU560z1nVDtgl0paCcyYs/cars.jpg"' -i 'use_m2m=true' -i 'points_per_side=32' -i 'pred_iou_thresh=0.88' -i 'stability_score_thresh=0.95'

8W9aG commented 2 months ago

This largely depends on what the code in predict is doing. In the former's case, it runs another pip install in the /src directory, installing into Python's global namespace. I suppose that since this image was built, the PyPI registry has gained more dependencies, making resolution harder than it used to be, so the time it takes to resolve them now runs past the timeout.

In this case, perhaps we can add a '--timeout' flag to the CLI so the user can set the timeout manually.
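
To illustrate the idea of a user-settable timeout, here is a minimal Python sketch of a wrapper that runs a command and gives up after a configurable number of seconds. It is not how cog is implemented. The '--timeout' flag here belongs to this hypothetical wrapper only, and the names build_parser and run_with_timeout are made up for the example.

```python
import argparse
import subprocess
import sys


def build_parser():
    # Hypothetical wrapper CLI; this is not cog's own argument parser.
    parser = argparse.ArgumentParser(prog="predict-wrapper")
    parser.add_argument(
        "--timeout",
        type=float,
        default=None,
        help="seconds to wait before giving up (default: wait forever)",
    )
    return parser


def run_with_timeout(cmd, timeout_seconds=None):
    """Run cmd and return its exit code, or None if it timed out.

    subprocess.run kills the child process when the timeout expires
    and then raises subprocess.TimeoutExpired.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode
```

With something like this, a user who knows that an older image spends a long time in pip could pass a larger value (for example, `--timeout 900`) instead of hitting a fixed limit.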