Closed ghost closed 2 years ago
pip install clip-server
resolves one issue. I can't figure out the executor.py issue, but it appears to be fixed in the "feat-dockerize" branch.
Thank you for the reply @borrecw. I added clip-server to the requirements.txt and ran the build again. Everything worked until it again threw a bunch of CUDA out of memory errors.
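For anyone hitting the same ModuleNotFoundError, here is a minimal sketch of a check that can be run inside the container before starting the Flow. The module names come from the traceback in the original report; the helper name `missing_modules` is made up for illustration.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # clip_client and clip_server are the two modules named in the traceback.
    missing = missing_modules(["clip_client", "clip_server"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("both modules are importable")
```

The traceback also shows that clip_client only imports clip_server when NO_VERSION_CHECK is absent from os.environ, so setting that variable before the import might be another way around the error. I have not tested that.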
Is it even possible to run dalle-flow on a 3090? Only 500 MB of the 24 GB was allocated before I ran docker run, so I should have the required 21 GB free. The Flow ends up providing the URLs for the server, but they don't seem to work in the Colab. I tried the local, the public, and the private link; I'm not sure which one is supposed to be used.
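The memory arithmetic above can be checked with a short script. This is only a sketch: it assumes nvidia-smi is on the PATH, the helper names are hypothetical, and the 21 GB figure is simply the requirement quoted in this thread.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse lines of 'total, used' (MiB) into a list of (total, used) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        total, used = (int(field) for field in line.split(","))
        gpus.append((total, used))
    return gpus


def has_headroom(total_mib, used_mib, required_gib):
    """True if the free memory (total - used) covers the requirement."""
    return (total_mib - used_mib) >= required_gib * 1024


def query_gpus():
    """Ask nvidia-smi for total and used memory of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    try:
        for index, (total, used) in enumerate(query_gpus()):
            ok = has_headroom(total, used, required_gib=21)
            print(f"GPU {index}: {total - used} MiB free, enough for 21 GB: {ok}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("could not query nvidia-smi:", exc)
```

Note that this only reports memory that is free before the container starts. It says nothing about how much each executor allocates once the Flow is running, so an out of memory error can still occur even when the check passes.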
I had the docker branch running on V100s on an AWS spot p3 instance. Everything worked except upscaling before I was preempted.
Closing for now, as we are trying to provide an auto-built Docker image in the next few hours. Feel free to reopen the issue if the new image still doesn't work.
Hello, I ran the prebuilt image, and even though I still get a CUDA out of memory error, the server seems to run nonetheless. However, after copying and pasting the private or public address into the Colab, I still get this error:
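One thing worth ruling out is whether the address is reachable at all from the machine running the notebook. If the Colab runs on a hosted runtime rather than on your own machine, a private LAN address would generally not be reachable from it. Below is a rough sketch of a plain TCP check using only the standard library; the function name and the example address are placeholders, not values from this thread.

```python
import socket
from urllib.parse import urlparse


def can_connect(server_url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(server_url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"expected scheme://host:port, got {server_url!r}")
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder address: replace it with the one printed by the Flow.
    url = "grpc://127.0.0.1:51005"
    print(url, "reachable:", can_connect(url))
```

A successful connection only shows that the port is open. It does not prove that the Flow behind it started all of its executors.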
Hello, I'm running into errors when running in Docker.
Right off the bat it throws this error:
`ERROR rerank/rep-0@21 ImportError('can not import module from /dalle/dalle-flow/executors/rerank/executor.py') during <class 'jina.serve.runtimes.worker.WorkerRuntime'> initialization [06/20/22 13:48:52]
add "--quiet-error" to suppress the exception details
╭──────── Traceback (most recent call last) ────────╮
│ /usr/local/lib/python3.8/dist-packages/jina/impo… │
│ in _path_import │
│ │
│ 124 │ │ spec = importlib.util.spec_from_fil │
│ 125 │ │ module = importlib.util.module_from │
│ 126 │ │ sys.modules[spec_name] = module │
│ ❱ 127 │ │ spec.loader.exec_module(module) │
│ 128 │ except Exception as ex: │
│ 129 │ │ raise ImportError(f'can not import │
│ 130 │
│:848 in │
│ exec_module │
│ :219 in │
│ _call_with_frames_removed │
│ │
│ /dalle/dalle-flow/executors/rerank/executor.py:1 │
│ in │
│ │
│ ❱ 1 from clip_client import Client │
│ 2 from jina import Executor, requests, Documen │
│ 3 │
│ 4 │
│ │
│ /usr/local/lib/python3.8/dist-packages/clip_clie… │
│ in │
│ │
│ 5 from clip_client.client import Client │
│ 6 │
│ 7 if 'NO_VERSION_CHECK' not in os.environ: │
│ ❱ 8 │ from clip_server.helper import is_latest │
│ 9 │ │
│ 10 │ is_latest_version(github_repo='clip-as-s │
│ 11 │
╰───────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'clip_server'
then after downloading the mega model from wandb:
ModuleNotFoundError: No module named 'clip_server'
then it gives me a CUDA out of memory error (I have a 24 GB RTX 3090, so that shouldn't be a problem, right?) and then gets stuck at:
wandb: Downloading large artifact mega-1-fp16:latest, 4938.53MB. 7 files... Done. 0:0:7.3
device count: 1
DEBUG dalle/rep-0@19 <executor.DalleGenerator object at 0x7f571825dc40> is successfully loaded! [06/20/22 13:51:01]
DEBUG dalle/rep-0@19 start listening on 0.0.0.0:59167
DEBUG dalle/rep-0@ 1 ready and listening [06/20/22 13:51:01]
ERROR Flow@ 1 Flow is aborted due to ['diffusion', 'rerank'] can not be started. [06/20/22 13:51:01]
DEBUG gateway/rep-0@ 1 waiting for ready or shutdown signal from runtime [06/20/22 13:51:01]
DEBUG gateway/rep-0@ 1 terminate
DEBUG gateway/rep-0@ 1 terminating the runtime process
DEBUG gateway/rep-0@ 1 runtime process properly terminated
DEBUG gateway/rep-0@ 1 terminated [06/20/22 13:51:02]
DEBUG gateway/rep-0@ 1 joining the process
DEBUG gateway/rep-0@35 process terminated [06/20/22 13:51:02]
DEBUG gateway/rep-0@ 1 successfully joined the process
DEBUG store/rep-0@ 1 waiting for ready or shutdown signal from runtime [06/20/22 13:51:02]
DEBUG store/rep-0@ 1 terminate
DEBUG store/rep-0@ 1 terminating the runtime process
DEBUG store/rep-0@ 1 runtime process properly terminated
DEBUG store/rep-0@23 cancel WorkerRuntime [06/20/22 13:51:02]
DEBUG store/rep-0@23 stopped GRPC Server
DEBUG store/rep-0@23 cancel WorkerRuntime
DEBUG store/rep-0@23 stopped GRPC Server
DEBUG store/rep-0@ 1 terminated
DEBUG store/rep-0@ 1 joining the process
DEBUG store/rep-0@23 process terminated [06/20/22 13:51:02]
DEBUG store/rep-0@ 1 successfully joined the process
DEBUG upscaler/rep-0@ 1 waiting for ready or shutdown signal from runtime [06/20/22 13:51:02]
DEBUG upscaler/rep-0@ 1 terminate
DEBUG upscaler/rep-0@ 1 terminating the runtime process
DEBUG upscaler/rep-0@ 1 runtime process properly terminated
DEBUG upscaler/rep-0@22 cancel WorkerRuntime [06/20/22 13:51:02]
DEBUG upscaler/rep-0@22 stopped GRPC Server
DEBUG upscaler/rep-0@22 cancel WorkerRuntime
DEBUG upscaler/rep-0@22 stopped GRPC Server
DEBUG upscaler/rep-0@ 1 terminated
DEBUG upscaler/rep-0@ 1 joining the process
DEBUG upscaler/rep-0@22 process terminated [06/20/22 13:51:02]
DEBUG upscaler/rep-0@ 1 successfully joined the process
DEBUG dalle/rep-0@ 1 waiting for ready or shutdown signal from runtime [06/20/22 13:51:02]
DEBUG dalle/rep-0@ 1 terminate
DEBUG dalle/rep-0@ 1 terminating the runtime process
DEBUG dalle/rep-0@ 1 runtime process properly terminated
DEBUG dalle/rep-0@19 cancel WorkerRuntime [06/20/22 13:51:02]
DEBUG dalle/rep-0@19 stopped GRPC Server
DEBUG dalle/rep-0@19 cancel WorkerRuntime
DEBUG dalle/rep-0@19 stopped GRPC Server
DEBUG dalle/rep-0@ 1 terminated
DEBUG dalle/rep-0@19 process terminated [06/20/22 13:51:02]
DEBUG dalle/rep-0@ 1 joining the process
Any ideas on what the problem is? I'm running Ubuntu under WSL2 on Windows 10.
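When the log is this long, it can help to pull out just the executors that failed to start. The sketch below assumes the abort line has the same wording as in the log above. It strips the interleaved timestamps first, because they can land in the middle of the list. The function name is hypothetical.

```python
import ast
import re

# Timestamps such as [06/20/22 13:51:01] get interleaved with the messages.
TIMESTAMP_RE = re.compile(r"\[\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\]\s*")
ABORT_RE = re.compile(r"Flow is aborted due to (\[.*?\]) can not be started", re.DOTALL)


def failed_executors(log_text):
    """Return the executor names listed in the 'Flow is aborted' line, if any."""
    cleaned = TIMESTAMP_RE.sub("", log_text)
    match = ABORT_RE.search(cleaned)
    if match is None:
        return []
    # The list is printed as a Python literal, e.g. ['diffusion', 'rerank'].
    return ast.literal_eval(match.group(1))


if __name__ == "__main__":
    sample = ("ERROR Flow@ 1 Flow is aborted due to ['diffusion', "
              "[06/20/22 13:51:01] 'rerank'] can not be started.")
    print(failed_executors(sample))
```

In the log above this points at diffusion and rerank, while dalle itself loaded successfully. The rerank failure matches the ImportError shown earlier. The excerpt does not show why diffusion failed.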