jina-ai / executors

internal-only
Apache License 2.0

DPRTextEncoder unable to start with context as encode type #305

Open winstonww opened 2 years ago

winstonww commented 2 years ago

In a discussion with @JoanFM, we found that DPRTextEncoder fails to start when encoder_type is set to context.

A minimal example to illustrate this:

from jina import Flow

f = Flow().add(
    uses='jinahub+docker://DPRTextEncoder',
    uses_with={'encoder_type': 'context'},
)
with f:
    pass

Results:

      executor0@31205[I]:Traceback (most recent call last):
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 78, in _load_executor
      executor0@31205[I]:extra_search_paths=self.args.extra_search_paths,
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/jaml/__init__.py", line 613, in load_config
      executor0@31205[I]:return JAML.load(tag_yml, substitute=False)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/jaml/__init__.py", line 90, in load
      executor0@31205[I]:r = yaml.load(stream, Loader=JinaLoader)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/yaml/__init__.py", line 81, in load
      executor0@31205[I]:return loader.get_single_data()
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/yaml/constructor.py", line 51, in get_single_data
      executor0@31205[I]:return self.construct_document(node)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/yaml/constructor.py", line 55, in construct_document
      executor0@31205[I]:data = self.construct_object(node)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/yaml/constructor.py", line 100, in construct_object
      executor0@31205[I]:data = constructor(self, node)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/jaml/__init__.py", line 452, in _from_yaml
      executor0@31205[I]:return get_parser(cls, version=data.get('version', None)).parse(cls, data)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/jaml/parsers/executor/legacy.py", line 73, in parse
      executor0@31205[I]:runtime_args=data.get('runtime_args', {}),
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/jina/executors/decorators.py", line 60, in arg_wrapper
      executor0@31205[I]:f = func(self, *args, **kwargs)
      executor0@31205[I]:File "/workspace/dpr_text.py", line 85, in __init__
      executor0@31205[I]:pretrained_model_name_or_path
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1357, in from_pretrained
      executor0@31205[I]:_fast_init=_fast_init,
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1455, in _load_state_dict_into_model
      executor0@31205[I]:model._init_weights(module)
      executor0@31205[I]:File "/usr/local/lib/python3.7/site-packages/transformers/modeling_utils.py", line 579, in _init_weights
      executor0@31205[I]:raise NotImplementedError(f"Make sure `_init_weigths` is implemented for {self.__class__}")
      executor0@31205[I]:NotImplementedError: Make sure `_init_weigths` is implemented for <class 'transformers.models.dpr.modeling_dpr.DPRContextEncoder'>

winstonww commented 2 years ago

After looking into this: the question and context encoders use different transformer models with different model paths/repos, so we can fix the issue by setting the model path to facebook/dpr-ctx_encoder-single-nq-base when initializing the context encoder.
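
A minimal sketch of that workaround from the caller's side. It assumes the executor accepts the model path through uses_with under the name pretrained_model_name_or_path (the name that appears in the traceback at dpr_text.py line 85), and that facebook/dpr-question_encoder-single-nq-base is the matching checkpoint for the question encoder. DEFAULT_DPR_MODELS, dpr_uses_with and build_flow are hypothetical helpers, not part of the executor.

```python
# Assumed mapping from encoder type to a matching Hugging Face checkpoint.
DEFAULT_DPR_MODELS = {
    'question': 'facebook/dpr-question_encoder-single-nq-base',
    'context': 'facebook/dpr-ctx_encoder-single-nq-base',
}


def dpr_uses_with(encoder_type: str) -> dict:
    """Build a uses_with dict that pairs the encoder type with its checkpoint."""
    if encoder_type not in DEFAULT_DPR_MODELS:
        raise ValueError(
            f"encoder_type must be one of {sorted(DEFAULT_DPR_MODELS)}, "
            f"got {encoder_type!r}"
        )
    return {
        'encoder_type': encoder_type,
        # Parameter name taken from the traceback; assumed to be settable here.
        'pretrained_model_name_or_path': DEFAULT_DPR_MODELS[encoder_type],
    }


def build_flow(encoder_type: str):
    """Create a Flow with the DPR executor configured for the given type."""
    from jina import Flow  # imported lazily so the helper above works without jina

    return Flow().add(
        uses='jinahub+docker://DPRTextEncoder',
        uses_with=dpr_uses_with(encoder_type),
    )
```

With jina and Docker available, `with build_flow('context'): pass` should then replace the failing example above.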

JoanFM commented 2 years ago

> After looking into this: the question and context encoders use different transformer models with different model paths/repos, so we can fix the issue by setting the model path to facebook/dpr-ctx_encoder-single-nq-base when initializing the context encoder.

Please add this info to the executor README, or raise a meaningful exception message.
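
One way the "meaningful exception" option could look, as a rough sketch rather than the executor's actual code. The check is a heuristic that assumes DPR checkpoint names contain ctx_encoder or question_encoder, as the facebook/dpr-* names do. check_dpr_model and EXPECTED_NAME_MARKER are hypothetical names.

```python
# Assumed substring that a checkpoint name carries for each encoder type.
EXPECTED_NAME_MARKER = {
    'question': 'question_encoder',
    'context': 'ctx_encoder',
}


def check_dpr_model(encoder_type: str, model_name: str) -> None:
    """Raise a descriptive error when the checkpoint does not match the type."""
    if encoder_type not in EXPECTED_NAME_MARKER:
        raise ValueError(
            f"encoder_type must be 'question' or 'context', got {encoder_type!r}"
        )
    marker = EXPECTED_NAME_MARKER[encoder_type]
    if marker not in model_name:
        raise ValueError(
            f"encoder_type={encoder_type!r} expects a checkpoint whose name "
            f"contains {marker!r}, but got {model_name!r}. For the context "
            f"encoder, try 'facebook/dpr-ctx_encoder-single-nq-base'."
        )
```

Calling such a check before loading the model would surface the mismatch directly instead of the NotImplementedError from transformers shown in the traceback.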