jina-ai / dalle-flow

🌊 A Human-in-the-Loop workflow for creating HD images from text
grpcs://dalle-flow.dev.jina.ai

Local Server Error -- "RuntimeFailToStart" #15

Closed 6630507 closed 2 years ago

6630507 commented 2 years ago

I'm trying to run a local server in an Anaconda environment. I've tried rebuilding the environment and even installed all dependencies manually, but I still get the same error. I'm running Mint/Ubuntu 20.04 with an RTX 3090 (I needed to update PyTorch for the 3090; this initially threw an error as well).

The first error seems to be "can not import module from /home/user/ml/dalle/dalle-flow/executors/dalle/dm_helper.py":

DEBUG store/rep-0@17770 ready and listening [05/15/22 14:01:07]
⠙ Waiting dalle diffusion rerank upscaler store gateway ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/6 0:00:00
DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses (raised from /home/leary/anaconda3/envs/dalle/lib/python3.8/site-packages/flatbuffers/compat.py:19)
⠹ Waiting dalle diffusion store ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 3/6 0:00:00
CRITI… dalle/rep-0@17773 can not load the executor from executors/dalle/config.yml [05/15/22 14:01:07]
⠴ Waiting dalle diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:00
ERROR dalle/rep-0@17773 ImportError('can not import module from /home/leary/ml/dalle/dalle-flow/executors/dalle/dm_helper.py') during <class 'jina.serve.runtimes.worker.WorkerRuntime'> initialization [05/15/22 14:01:07]
add "--quiet-error" to suppress the exception details

The second error appears to be "AttributeError: module 'jaxlib.pocketfft' has no attribute 'pocketfft'", which is followed by an additional "ImportError: can not import module from /home/user/ml/dalle/dalle-flow/executors/dalle/dm_helper.py".
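
These two messages likely describe a single failure rather than two separate ones: the Jina importer appears to catch whatever goes wrong while executing dm_helper.py and re-raise it as an ImportError (`raise ImportError(...) from ex`), so the AttributeError would be the underlying cause. The following is a minimal sketch of how such a chain can be unwound in plain Python. `root_cause` and `load_helper` are hypothetical names introduced for illustration, and `load_helper` only imitates the wrapping; it is not Jina code.

```python
def root_cause(exc: BaseException) -> BaseException:
    """Follow the __cause__ / __context__ chain to the innermost exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        inner = exc.__cause__ or exc.__context__
        if inner is None:
            break
        exc = inner
    return exc


def load_helper():
    """Hypothetical stand-in for an importer that wraps the real failure."""
    try:
        raise AttributeError("module 'jaxlib.pocketfft' has no attribute 'pocketfft'")
    except Exception as ex:
        raise ImportError("can not import module from dm_helper.py") from ex


try:
    load_helper()
except ImportError as err:
    cause = root_cause(err)
    # The innermost exception is the one worth searching for.
    print(type(cause).__name__, "->", cause)
```

In practice this means the last traceback block printed before "The above exception was the direct cause of the following exception" is usually the one to look at.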

hanxiao commented 2 years ago

jax too old
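
To see which versions are actually installed in the active environment, a small check along these lines may help. It is only a sketch: `installed_versions` is a hypothetical helper, and the package list is an assumption based on the libraries that appear in the traceback.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of relevant distributions; adjust as needed.
    names = ["jax", "jaxlib", "flax", "dalle-mini"]
    for name, version in installed_versions(names).items():
        print(f"{name:12s} {version or 'not installed'}")
```

Run it with the same interpreter that launches the flow (for example, inside the activated conda environment), since a different Python may see different packages.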

6630507 commented 2 years ago

Thanks for your quick response.

I updated Jax to 0.3.12, but the local server still fails to run.

Now I'm getting this error:

  ImportError: cannot import name 'isin' from 'jax._src.numpy.lax_numpy'

and also

dalle/rep-0@16335 ImportError('can not import module from /home/user/ml/dalle-flow/dalle-flow/executors/dalle/dm_helper.py') during <class 'jina.serve.runtimes.worker.WorkerRuntime'> initialization [05/16/22 09:28:38]
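
As the full traceback in this comment shows, the failing statement is flax importing `isin` from the private module `jax._src.numpy.lax_numpy`, which suggests that the installed flax and jax releases do not match each other. A minimal sketch for probing this without starting the whole flow is given here; `can_import_name` is a hypothetical helper, not part of jax, flax, or Jina.

```python
import importlib


def can_import_name(module_name, attr):
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Module missing, or it failed while importing.
        return False
    return hasattr(module, attr)


# The exact import that flax attempts in the traceback.
print(can_import_name("jax._src.numpy.lax_numpy", "isin"))
```

If this prints False, the installed flax expects a jax layout that the installed jax does not provide, so the two packages probably need to be brought to compatible versions.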

Here is the complete run:


MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNNNNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNNWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMWxxxxxxxxxOMMMMMNxxxxxxxxx0MMMMMKddddddxkKWMMMMMMMMMMMMXOxdddxONMMMM
MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllo0MMMMMMMM0olllllllllo0MM
MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOlllllllllllloWMMMMMdllllllllllllldM
MMMMMMMMMMMMXllllllllldMMMMM0lllllllllxMMMMMOllllllllllllloMMMM0lllllllllllllllK
MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllllKMMM0lllllllllllllllO
MMMMMMMMMMMMKllllllllldMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMollllllllllllllO
MWOkkkkk0MMMKlllllllllkMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMxlllllllllllllO
NkkkkkkkkkMMKlllllllloMMMMMM0lllllllllxMMMMMOllllllllllllll0MMMMMMWOdolllllllllO
KkkkkkkkkkNMKllllllldMMMMMMMMWWWWWWWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MOkkkkkkk0MMKllllldXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMWX00KXMMMMXxk0XMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM

/home/user/anaconda3/envs/dalle-flow/bin/jina flow
--uses flow.yml                                    
╭──────────────┬──────────────────────────────────╮
│     Argument │ Value                            │
├──────────────┼──────────────────────────────────┤
│          cli │ flow                             │
│          env │ None                             │
│      inspect │ COLLECT                          │
│   log-config │ default                          │
│         name │ None                             │
│        quiet │ False                            │
│  quiet-error │ False                            │
│         uses │ flow.yml                         │
│    workspace │ None                             │
│ workspace-id │ d9d7456330774ddd9d5c93bd1cb38617 │
╰──────────────┴──────────────────────────────────╯
UserWarning: It looks like you are trying to import multiple python modules using `py_modules`. When using multiple python files to define an executor, the recommended practice is to structure the files in a python package, and only import the `__init__.py` file of that package. For more details, please check out the cookbook: https://docs.jina.ai/fundamentals/executor/repository-structure/ (raised from /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/jaml/helper.py:260)
⠋  Waiting ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/0 -:--:--
DEBUG  gateway/rep-0@16341 adding connection for deployment dalle/heads/0 to grpc://0.0.0.0:49629  [05/16/22 09:28:36]
DEBUG  upscaler/rep-0@16338 start listening on 0.0.0.0:58390                                       [05/16/22 09:28:36]
DEBUG  gateway/rep-0@16341 adding connection for deployment diffusion/heads/0 to                                      
       grpc://0.0.0.0:58837                                                                                           
DEBUG  gateway/rep-0@16341 adding connection for deployment rerank/heads/0 to grpc://0.0.0.0:63780                    
DEBUG  gateway/rep-0@16341 adding connection for deployment upscaler/heads/0 to                                       
       grpc://0.0.0.0:58390                                                                                           
DEBUG  gateway/rep-0@16341 adding connection for deployment store/heads/0 to grpc://0.0.0.0:51033                     
DEBUG  rerank/rep-0@16337 start listening on 0.0.0.0:63780                                         [05/16/22 09:28:36]
DEBUG  gateway/rep-0@16341 start server bound to 0.0.0.0:51005                                                        
DEBUG  rerank/rep-0@16332 ready and listening                                                      [05/16/22 09:28:37]
⠋  Waiting ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/0 -:--:--
DEBUG  store/rep-0@16340 start listening on 0.0.0.0:51033                                          [05/16/22 09:28:37]
DEBUG  upscaler/rep-0@16332 ready and listening                                                    [05/16/22 09:28:37]
DEBUG  store/rep-0@16332 ready and listening                                                       [05/16/22 09:28:37]
DEBUG  gateway/rep-0@16332 ready and listening                                                     [05/16/22 09:28:37]
⠸ Waiting dalle diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:00
DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses (raised from /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/flatbuffers/compat.py:19)
⠦ Waiting dalle diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:00
Using device: cuda:0
⠴ Waiting dalle diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:01
CRITI… dalle/rep-0@16335 can not load the executor from executors/dalle/config.yml                 [05/16/22 09:28:38]
⠧ Waiting dalle diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:01
ERROR  dalle/rep-0@16335 ImportError('can not import module from                                   [05/16/22 09:28:38]
       /home/user/ml/dalle-flow/dalle-flow/executors/dalle/dm_helper.py') during <class                              
       'jina.serve.runtimes.worker.WorkerRuntime'> initialization                                                     
        add "--quiet-error" to suppress the exception details                                                         
       ╭─────────────────────────── Traceback (most recent call last) ───────────────────────────╮                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/importer.py:127  │                    
       │ in _path_import                                                                         │                    
       │                                                                                         │                    
       │   124 │   │   spec = importlib.util.spec_from_file_location(spec_name, absolute_path)   │                    
       │   125 │   │   module = importlib.util.module_from_spec(spec)                            │                    
       │   126 │   │   sys.modules[spec_name] = module                                           │                    
       │ ❱ 127 │   │   spec.loader.exec_module(module)                                           │                    
       │   128 │   except Exception as ex:                                                       │                    
       │   129 │   │   raise ImportError(f'can not import module from {absolute_path}') from ex  │                    
       │   130                                                                                   │                    
       │ <frozen importlib._bootstrap_external>:843 in exec_module                               │                    
       │ <frozen importlib._bootstrap>:219 in _call_with_frames_removed                          │                    
       │                                                                                         │                    
       │ /home/user/ml/dalle-flow/dalle-flow/executors/dalle/dm_helper.py:9 in <module>         │                    
       │                                                                                         │                    
       │     6 import numpy as np                                                                │                    
       │     7 import wandb                                                                      │                    
       │     8 from PIL import Image                                                             │                    
       │ ❱   9 from dalle_mini import DalleBart, DalleBartProcessor                              │                    
       │    10 from flax.jax_utils import replicate                                              │                    
       │    11 from flax.training.common_utils import shard_prng_key                             │                    
       │    12 from vqgan_jax.modeling_flax_vqgan import VQModel                                 │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/dalle_mini/__init__.… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │   1 __version__ = "0.0.6"                                                               │                    
       │   2                                                                                     │                    
       │ ❱ 3 from .model import DalleBart, DalleBartProcessor                                    │                    
       │   4                                                                                     │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/dalle_mini/model/__i… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │   1 from .configuration import DalleBartConfig                                          │                    
       │ ❱ 2 from .modeling import DalleBart                                                     │                    
       │   3 from .partitions import set_partitions                                              │                    
       │   4 from .processor import DalleBartProcessor                                           │                    
       │   5 from .tokenizer import DalleBartTokenizer                                           │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/dalle_mini/model/mod… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │     18 from functools import partial                                                    │                    
       │     19 from typing import Any, Dict, Optional, Tuple                                    │                    
       │     20                                                                                  │                    
       │ ❱   21 import flax                                                                      │                    
       │     22 import flax.linen as nn                                                          │                    
       │     23 import jax                                                                       │                    
       │     24 import jax.numpy as jnp                                                          │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/flax/__init__.py:20   │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │   17 """Flax API."""                                                                    │                    
       │   18                                                                                    │                    
       │   19 from . import core                                                                 │                    
       │ ❱ 20 from . import linen                                                                │                    
       │   21 from . import optim                                                                │                    
       │   22 # DO NOT REMOVE - Marker for internal deprecated API.                              │                    
       │   23 # DO NOT REMOVE - Marker for internal logging.                                     │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/flax/linen/__init__.… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │   17                                                                                    │                    
       │   18 # pylint: disable=g-multiple-import                                                │                    
       │   19 # re-export commonly used modules and functions                                    │                    
       │ ❱ 20 from .activation import (celu, elu, gelu, glu, leaky_relu, log_sigmoid,            │                    
       │   21 │   │   │   │   │   │    log_softmax, relu, sigmoid, soft_sign, softmax,           │                    
       │   22 │   │   │   │   │   │    softplus, swish, silu, tanh, PReLU)                       │                    
       │   23 from .attention import (MultiHeadDotProductAttention, SelfAttention,               │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/flax/linen/activatio… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │   43                                                                                    │                    
       │   44 from typing import Any                                                             │                    
       │   45                                                                                    │                    
       │ ❱ 46 from flax.linen.module import Module, compact                                      │                    
       │   47 import jax.numpy as jnp                                                            │                    
       │   48                                                                                    │                    
       │   49                                                                                    │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/flax/linen/module.py… │                    
       │ in <module>                                                                             │                    
       │                                                                                         │                    
       │     29                                                                                  │                    
       │     30 import jax                                                                       │                    
       │     31 from jax import tree_util                                                        │                    
       │ ❱   32 from jax._src.numpy.lax_numpy import isin                                        │                    
       │     33 import numpy as np                                                               │                    
       │     34                                                                                  │                    
       │     35 import flax                                                                      │                    
       ╰─────────────────────────────────────────────────────────────────────────────────────────╯                    
       ImportError: cannot import name 'isin' from 'jax._src.numpy.lax_numpy'                                         
       (/home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jax/_src/numpy/lax_nump…                    

       The above exception was the direct cause of the following exception:                                           

       ╭─────────────────────────── Traceback (most recent call last) ───────────────────────────╮                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/orchestrate/pod… │                    
       │ in run                                                                                  │                    
       │                                                                                         │                    
       │    71 │                                                                                 │                    
       │    72 │   try:                                                                          │                    
       │    73 │   │   _set_envs()                                                               │                    
       │ ❱  74 │   │   runtime = runtime_cls(                                                    │                    
       │    75 │   │   │   args=args,                                                            │                    
       │    76 │   │   )                                                                         │                    
       │    77 │   except Exception as ex:                                                       │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in __init__                                                                             │                    
       │                                                                                         │                    
       │    26 │   │   :param args: args from CLI                                                │                    
       │    27 │   │   :param kwargs: keyword args                                               │                    
       │    28 │   │   """                                                                       │                    
       │ ❱  29 │   │   super().__init__(args, **kwargs)                                          │                    
       │    30 │                                                                                 │                    
       │    31 │   async def async_setup(self):                                                  │                    
       │    32 │   │   """                                                                       │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in __init__                                                                             │                    
       │                                                                                         │                    
       │    64 │   │   │   )                                                                     │                    
       │    65 │   │                                                                             │                    
       │    66 │   │   self._setup_monitoring()                                                  │                    
       │ ❱  67 │   │   self._loop.run_until_complete(self.async_setup())                         │                    
       │    68 │                                                                                 │                    
       │    69 │   def run_forever(self):                                                        │                    
       │    70 │   │   """                                                                       │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/asyncio/base_events.py:616 in       │                    
       │ run_until_complete                                                                      │                    
       │                                                                                         │                    
       │    613 │   │   if not future.done():                                                    │                    
       │    614 │   │   │   raise RuntimeError('Event loop stopped before Future completed.')    │                    
       │    615 │   │                                                                            │                    
       │ ❱  616 │   │   return future.result()                                                   │                    
       │    617 │                                                                                │                    
       │    618 │   def stop(self):                                                              │                    
       │    619 │   │   """Stop running the event loop.                                          │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in async_setup                                                                          │                    
       │                                                                                         │                    
       │    32 │   │   """                                                                       │                    
       │    33 │   │   Start the DataRequestHandler and wait for the GRPC and Monitoring servers │                    
       │       start                                                                             │                    
       │    34 │   │   """                                                                       │                    
       │ ❱  35 │   │   await self._async_setup_grpc_server()                                     │                    
       │    36 │   │                                                                             │                    
       │    37 │   │   if self.metrics_registry:                                                 │                    
       │    38 │   │   │   with ImportExtensions(                                                │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in _async_setup_grpc_server                                                             │                    
       │                                                                                         │                    
       │    63 │   │   # Keep this initialization order                                          │                    
       │    64 │   │   # otherwise readiness check is not valid                                  │                    
       │    65 │   │   # The DataRequestHandler needs to be started BEFORE the grpc server       │                    
       │ ❱  66 │   │   self._data_request_handler = DataRequestHandler(                          │                    
       │    67 │   │   │   self.args, self.logger, self.metrics_registry                         │                    
       │    68 │   │   )                                                                         │                    
       │    69                                                                                   │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in __init__                                                                             │                    
       │                                                                                         │                    
       │    38 │   │   self.args.parallel = self.args.shards                                     │                    
       │    39 │   │   self.logger = logger                                                      │                    
       │    40 │   │   self._is_closed = False                                                   │                    
       │ ❱  41 │   │   self._load_executor(metrics_registry)                                     │                    
       │    42 │   │   self._init_monitoring(metrics_registry)                                   │                    
       │    43 │                                                                                 │                    
       │    44 │   def _init_monitoring(self, metrics_registry: Optional['CollectorRegistry'] =  │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/serve/runtimes/… │                    
       │ in _load_executor                                                                       │                    
       │                                                                                         │                    
       │    67 │   │   :param metrics_registry: Optional prometheus metrics registry that will b │                    
       │       passed to the executor so that it can expose metrics                              │                    
       │    68 │   │   """                                                                       │                    
       │    69 │   │   try:                                                                      │                    
       │ ❱  70 │   │   │   self._executor: BaseExecutor = BaseExecutor.load_config(              │                    
       │    71 │   │   │   │   self.args.uses,                                                   │                    
       │    72 │   │   │   │   uses_with=self.args.uses_with,                                    │                    
       │    73 │   │   │   │   uses_metas=self.args.uses_metas,                                  │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/jaml/__init__.p… │                    
       │ in load_config                                                                          │                    
       │                                                                                         │                    
       │   719 │   │   │                                                                         │                    
       │   720 │   │   │   if allow_py_modules:                                                  │                    
       │   721 │   │   │   │   _extra_search_paths = extra_search_paths or []                    │                    
       │ ❱ 722 │   │   │   │   load_py_modules(                                                  │                    
       │   723 │   │   │   │   │   no_tag_yml,                                                   │                    
       │   724 │   │   │   │   │   extra_search_paths=(_extra_search_paths + [os.path.dirname(s_ │                    
       │   725 │   │   │   │   │   if s_path                                                     │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/jaml/helper.py:… │                    
       │ in load_py_modules                                                                      │                    
       │                                                                                         │                    
       │   267 │   │   │   )                                                                     │                    
       │   268 │   │                                                                             │                    
       │   269 │   │   mod = [complete_path(m, extra_search_paths) for m in mod]                 │                    
       │ ❱ 270 │   │   PathImporter.add_modules(*mod)                                            │                    
       │   271                                                                                   │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/importer.py:152  │                    
       │ in add_modules                                                                          │                    
       │                                                                                         │                    
       │   149 │   │   │   │   │   f'cannot import module from {p}, file not exist'              │                    
       │   150 │   │   │   │   )                                                                 │                    
       │   151 │   │   │                                                                         │                    
       │ ❱ 152 │   │   │   _path_import(p)                                                       │                    
       │   153                                                                                   │                    
       │                                                                                         │                    
       │ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/importer.py:129  │                    
       │ in _path_import                                                                         │                    
       │                                                                                         │                    
       │   126 │   │   sys.modules[spec_name] = module                                           │                    
       │   127 │   │   spec.loader.exec_module(module)                                           │                    
       │   128 │   except Exception as ex:                                                       │                    
       │ ❱ 129 │   │   raise ImportError(f'can not import module from {absolute_path}') from ex  │                    
       │   130                                                                                   │                    
       │   131                                                                                   │                    
       │   132 class PathImporter:                                                               │                    
       ╰─────────────────────────────────────────────────────────────────────────────────────────╯                    
       ImportError: can not import module from                                                                        
       /home/user/ml/dalle-flow/dalle-flow/executors/dalle/dm_helper.py                                              
DEBUG  dalle/rep-0@16335 process terminated                                                                           
DEBUG  dalle/rep-0@16332 waiting for ready or shutdown signal from runtime                         [05/16/22 09:28:38]
DEBUG  dalle/rep-0@16332 shutdown is is already set. Runtime will end gracefully on its own                           
DEBUG  dalle/rep-0@16332 terminated                                                                                   
DEBUG  dalle/rep-0@16332 joining the process                                                                          
DEBUG  dalle/rep-0@16332 successfully joined the process                                                              
⠸ Waiting diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:11DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. (raised from /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py:4)
DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. (raised from /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py:4)
⠙ Waiting diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:20DEBUG  diffusion/rep-0@16336 start listening on 0.0.0.0:58837                                      [05/16/22 09:28:57]
DEBUG  diffusion/rep-0@16332 ready and listening                                                   [05/16/22 09:28:57]
ERROR  Flow@16332 Flow is aborted due to ['dalle'] can not be started.                             [05/16/22 09:28:57]
DEBUG  gateway/rep-0@16332 waiting for ready or shutdown signal from runtime                       [05/16/22 09:28:57]
DEBUG  gateway/rep-0@16332 terminate                                                                                  
DEBUG  gateway/rep-0@16332 terminating the runtime process                                                            
DEBUG  gateway/rep-0@16332 runtime process properly terminated                                                        
DEBUG  gateway/rep-0@16332 terminated                                                                                 
⠸ Waiting diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:20DEBUG  gateway/rep-0@16341 process terminated                                                      [05/16/22 09:28:57]
DEBUG  gateway/rep-0@16332 joining the process                                                                        
DEBUG  gateway/rep-0@16332 successfully joined the process                                                            
DEBUG  store/rep-0@16332 waiting for ready or shutdown signal from runtime                         [05/16/22 09:28:57]
DEBUG  store/rep-0@16332 terminate                                                                                    
DEBUG  store/rep-0@16332 terminating the runtime process                                                              
DEBUG  store/rep-0@16332 runtime process properly terminated                                                          
⠸ Waiting diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:20DEBUG  store/rep-0@16340 cancel WorkerRuntime                                                      [05/16/22 09:28:57]
DEBUG  store/rep-0@16340 stopped GRPC Server                                                                          
DEBUG  store/rep-0@16340 cancel WorkerRuntime                                                                         
DEBUG  store/rep-0@16340 stopped GRPC Server                                                                          
DEBUG  store/rep-0@16340 process terminated                                                        [05/16/22 09:28:57]
DEBUG  store/rep-0@16332 terminated                                                                                   
DEBUG  store/rep-0@16332 joining the process                                                                          
DEBUG  store/rep-0@16332 successfully joined the process                                                              
DEBUG  upscaler/rep-0@16332 waiting for ready or shutdown signal from runtime                      [05/16/22 09:28:57]
DEBUG  upscaler/rep-0@16332 terminate                                                                                 
DEBUG  upscaler/rep-0@16332 terminating the runtime process                                                           
DEBUG  upscaler/rep-0@16332 runtime process properly terminated                                                       
⠸ Waiting diffusion ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 4/6 0:00:20DEBUG  upscaler/rep-0@16338 cancel WorkerRuntime                                                   [05/16/22 09:28:57]
⠴  Waiting ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━ 5/6 0:00:20DEBUG  upscaler/rep-0@16338 stopped GRPC Server                                                                       
DEBUG  upscaler/rep-0@16338 cancel WorkerRuntime                                                                      
DEBUG  upscaler/rep-0@16338 stopped GRPC Server                                                                       
DEBUG  upscaler/rep-0@16338 process terminated                                                     [05/16/22 09:28:57]
DEBUG  upscaler/rep-0@16332 terminated                                                                                
DEBUG  upscaler/rep-0@16332 joining the process                                                                       
DEBUG  upscaler/rep-0@16332 successfully joined the process                                                           
DEBUG  rerank/rep-0@16332 waiting for ready or shutdown signal from runtime                        [05/16/22 09:28:57]
DEBUG  rerank/rep-0@16332 terminate                                                                                   
DEBUG  rerank/rep-0@16332 terminating the runtime process                                                             
DEBUG  rerank/rep-0@16332 runtime process properly terminated                                                         
⠴  Waiting ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━ 5/6 0:00:20DEBUG  rerank/rep-0@16337 cancel WorkerRuntime                                                     [05/16/22 09:28:57]
DEBUG  rerank/rep-0@16337 stopped GRPC Server                                                                         
DEBUG  rerank/rep-0@16337 cancel WorkerRuntime                                                                        
DEBUG  rerank/rep-0@16337 stopped GRPC Server                                                                         
DEBUG  rerank/rep-0@16337 process terminated                                                       [05/16/22 09:28:57]
DEBUG  rerank/rep-0@16332 terminated                                                                                  
DEBUG  rerank/rep-0@16332 joining the process                                                                         
DEBUG  rerank/rep-0@16332 successfully joined the process                                                             
DEBUG  diffusion/rep-0@16332 waiting for ready or shutdown signal from runtime                                        
DEBUG  diffusion/rep-0@16332 terminate                                                                                
DEBUG  diffusion/rep-0@16332 terminating the runtime process                                                          
DEBUG  diffusion/rep-0@16332 runtime process properly terminated                                                      
⠴  Waiting ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━ 5/6 0:00:20DEBUG  diffusion/rep-0@16336 cancel WorkerRuntime                                                                     
DEBUG  diffusion/rep-0@16336 stopped GRPC Server                                                                      
DEBUG  diffusion/rep-0@16336 cancel WorkerRuntime                                                                     
DEBUG  diffusion/rep-0@16336 stopped GRPC Server                                                                      
DEBUG  diffusion/rep-0@16336 process terminated                                                    [05/16/22 09:28:57]
DEBUG  diffusion/rep-0@16332 terminated                                                                               
DEBUG  diffusion/rep-0@16332 joining the process                                                                      
DEBUG  diffusion/rep-0@16332 successfully joined the process                                       [05/16/22 09:28:58]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/user/anaconda3/envs/dalle-flow/bin/jina:8 in <module>                                     │
│                                                                                                  │
│   5 from cli import main                                                                         │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/cli/__init__.py:159 in main    │
│                                                                                                  │
│   156 │   │   │                                                                                  │
│   157 │   │   │   args = _get_run_args()                                                         │
│   158 │   │   │                                                                                  │
│ ❱ 159 │   │   │   getattr(api, args.cli.replace('-', '_'))(args)                                 │
│   160                                                                                            │
│                                                                                                  │
│ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/cli/api.py:216 in flow         │
│                                                                                                  │
│   213 │                                                                                          │
│   214 │   if args.uses:                                                                          │
│   215 │   │   f = Flow.load_config(args.uses)                                                    │
│ ❱ 216 │   │   with f:                                                                            │
│   217 │   │   │   f.block()                                                                      │
│   218 │   else:                                                                                  │
│   219 │   │   raise ValueError('start a flow from CLI requires a valid `--uses`')                │
│                                                                                                  │
│ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/orchestrate/flow/base.py: │
│ 1106 in __enter__                                                                                │
│                                                                                                  │
│   1103 │                                                                                         │
│   1104 │   def __enter__(self):                                                                  │
│   1105 │   │   with CatchAllCleanupContextManager(self):                                         │
│ ❱ 1106 │   │   │   return self.start()                                                           │
│   1107 │                                                                                         │
│   1108 │   def __exit__(self, exc_type, exc_val, exc_tb):                                        │
│   1109 │   │   if hasattr(self, '_stop_event'):                                                  │
│                                                                                                  │
│ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/orchestrate/flow/base.py: │
│ 1161 in start                                                                                    │
│                                                                                                  │
│   1158 │   │   │   if not v.external:                                                            │
│   1159 │   │   │   │   self.enter_context(v)                                                     │
│   1160 │   │                                                                                     │
│ ❱ 1161 │   │   self._wait_until_all_ready()                                                      │
│   1162 │   │                                                                                     │
│   1163 │   │   self._build_level = FlowBuildLevel.RUNNING                                        │
│   1164                                                                                           │
│                                                                                                  │
│ /home/user/anaconda3/envs/dalle-flow/lib/python3.8/site-packages/jina/orchestrate/flow/base.py: │
│ 1259 in _wait_until_all_ready                                                                    │
│                                                                                                  │
│   1256 │   │   │   │   │   f'Flow is aborted due to {error_deployments} can not be started.'     │
│   1257 │   │   │   │   )                                                                         │
│   1258 │   │   │   │   self.close()                                                              │
│ ❱ 1259 │   │   │   │   raise RuntimeFailToStart                                                  │
│   1260 │   │                                                                                     │
│   1261 │   │   if addr_table:                                                                    │
│   1262 │   │   │   print(                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeFailToStart
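
Note that the `ImportError` at the end of the first traceback is only a wrapper: `_path_import` re-raises with `from ex`, so the underlying failure is the chained exception. Below is a minimal sketch for surfacing that root cause outside of Jina. It assumes Python 3.8+ and that you run it from the repo root. `load_module_from_path` and `root_cause` are hypothetical helper names of my own, not Jina APIs.

```python
import importlib.util
import sys


def load_module_from_path(path, name="dm_helper_debug"):
    """Import a .py file directly from its path.

    Hypothetical debugging helper; it only approximates what a
    path-based importer does and is not Jina's own code.
    """
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except Exception:
        # Drop the half-initialised module so a retry starts clean.
        sys.modules.pop(name, None)
        raise
    return module


def root_cause(exc):
    """Follow __cause__ / __context__ links down to the innermost exception."""
    while True:
        inner = exc.__cause__ or exc.__context__
        if inner is None:
            return exc
        exc = inner
```

Calling `load_module_from_path("executors/dalle/dm_helper.py")` inside a `try`/`except Exception as err` and printing `repr(root_cause(err))` should show the error that the wrapper is hiding, rather than the generic "can not import module" message.
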

JoanFM commented 2 years ago

There must still be some problem with jax.

GallowsDove commented 2 years ago

I have the same issue after going through all the steps in the readme

dmd commented 2 years ago

Same.

foomonkey93 commented 2 years ago

I submitted an MR that updates the instructions; you explicitly need jax 0.3.13.

Pathos14489 commented 2 years ago

jax 0.3.13 is not helping the issue on my end. Python 3.9 and 3.10 both fail with the same issues.

foomonkey93 commented 2 years ago

I've been using Python 3.7; I know 3.10 definitely doesn't support it. You should also manually install the jaxlib wheel that matches your setup from https://whls.blob.core.windows.net/unstable/index.html
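
To see which versions actually ended up in the environment, something like the following may help. It is a rough sketch, assuming Python 3.8+ (for `importlib.metadata`). The helper names are made up for illustration, and the 0.3.13 minimum is simply the version mentioned above, not an official requirement.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.3.13' (or '0.3.13+cuda11') into (0, 3, 13).

    Local suffixes after '+' and non-numeric tails such as 'rc1' are dropped.
    """
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_minimum(package, minimum):
    """True/False if the package meets the minimum, None if it is missing."""
    found = installed_version(package)
    if found is None:
        return None
    return version_tuple(found) >= version_tuple(minimum)


if __name__ == "__main__":
    # Package list and minimum are assumptions taken from this thread.
    for pkg in ("jax", "jaxlib", "flax"):
        print(pkg, installed_version(pkg), check_minimum(pkg, "0.3.13"))
```

Run it with the same interpreter that starts the flow (the one inside the conda env), since a mismatch between jax and jaxlib is easy to miss when several environments are around.
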

Pathos14489 commented 2 years ago

Installing Jax and Flax directly from their git repos worked. (I also pip upgraded jaxlib; I'm unsure if it was necessary, but just in case it is, I'll mention it here.)

Pathos14489 commented 2 years ago

Well, I apparently spoke too soon. I've now tried Python 3.7, 3.8, 3.9, and 3.10. I've gotten past the original error, but now I'm stuck on:

DEBUG  diffusion/rep-0@15933 connected to 192.168.1.244:51000                                                                                                                                 
(1, 512) <class 'numpy.ndarray'>
100 0
  0%|                                                                                                                                                                 | 0/100 [00:00<?, ?it/s]
ERROR  diffusion/rep-0@15933 RuntimeError('mat1 and mat2 shapes cannot be multiplied (2x512 and 768x1280)')                                                                [06/18/22 05:42:50]
        add "--quiet-error" to suppress the exception details

Edit: Extra system details: OS: Ubuntu Server 22.04 LTS, freshly installed just for this purpose.

nerdyrodent commented 2 years ago

I recognised the error, but it took a bit of documentation diving to figure out how to get clip-as-service to use something other than ViT-B/32. Basically, create a .yml file like this one:

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with: 
        jit: False
        device: cuda
        name: ViT-L/14@336px
      metas:
        py_modules:
          - executors/clip_torch.py

then start the clip server with that config file, e.g.

python -m clip_server caas.yml

That should get round the

RuntimeError('mat1 and mat2 shapes cannot be multiplied (2x512 and 768x1280)')

issue
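
For context, the numbers in that error line up with CLIP embedding widths: ViT-B/32 produces 512-dimensional embeddings, while the ViT-L/14 variants produce 768-dimensional ones, and the failing multiply has a weight with 768 rows. Here is a small illustrative sketch of the shape rule in plain Python. The `CLIP_EMBED_DIM` table and the `matmul_shape` / `check_model` helpers are my own names, not part of clip-as-service or dalle-flow, and the weight shape is just read off the error message.

```python
# Embedding widths of the published CLIP models used in this thread.
CLIP_EMBED_DIM = {
    "ViT-B/32": 512,
    "ViT-L/14": 768,
    "ViT-L/14@336px": 768,
}

# Shape of the second matrix, copied from the error message (assumption:
# this is the layer that consumes the CLIP embedding).
WEIGHT_SHAPE = (768, 1280)


def matmul_shape(a, b):
    """Return the result shape of a @ b, or raise if the inner dims differ."""
    if a[1] != b[0]:
        raise ValueError(
            "mat1 and mat2 shapes cannot be multiplied "
            f"({a[0]}x{a[1]} and {b[0]}x{b[1]})"
        )
    return (a[0], b[1])


def check_model(model, batch=2):
    """Check whether a CLIP model's embedding fits the expected weight."""
    return matmul_shape((batch, CLIP_EMBED_DIM[model]), WEIGHT_SHAPE)


if __name__ == "__main__":
    for name in CLIP_EMBED_DIM:
        try:
            print(name, "->", check_model(name))
        except ValueError as err:
            print(name, "->", err)
```

With ViT-B/32 the inner dimensions are 512 and 768, which reproduces the message above; switching the server to a ViT-L/14 model makes them match.
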