jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0

Setting `return_type` to `DocList[OutputChatDoc]` produces errors #6175

Open netw0rkf10w opened 1 week ago

netw0rkf10w commented 1 week ago

Describe the bug

Hello, first of all, I apologize in advance if this is not a bug.

I have client code that looks like this:

def get_chat_engine_response_doc(form_data, authToken, *args, **kwargs):
    jina_client = jinaClient(port=GRPC_PORT, protocol="gRPC")
    input_doc = InputDoc.parse_raw(form_data["json_object"])

    output_docs = jina_client.post(
        on="/get_output_doc",
        inputs=input_doc,
        return_type=DocList[OutputDoc],
    )
    return {"json_output_docs": output_docs.json()}

and server code that looks like this:

class MyExec(Executor):
    @requests(on="/get_output_doc")
    async def get_output_doc(self, doc: InputDoc, **kwargs):
        output_docs = await self.process_doc(doc)
        doclist = DocList[OutputDoc]()
        for output_doc in output_docs:
            doclist.append(output_doc)
        return doclist

My InputDoc and OutputDoc are both subclassed from BaseDoc.

In a previous version of my code, self.process_doc(doc) returned only a single OutputDoc, so I set return_type=OutputDoc in the client function, and the server function was simply output_doc = await self.process_doc(doc); return output_doc. This was working fine.

Recently I made some changes in my self.process_doc(doc) function to return a list of OutputDoc, and came up with the above version of the code. However, this modification broke the code:

ValueError: <DocList[OutputDoc] (length=1)> is not a <class 'docarray.documents.legacy.legacy_document.LegacyDocument'>  

Is this a bug or did I do something wrong?

Thank you very much in advance for your help!

JoanFM commented 1 week ago

Hello,

You need to add a type annotation to your Executor method indicating that its return type is DocList[OutputDoc].

netw0rkf10w commented 1 week ago

> Hello,
>
> You need to add a type annotation to your Executor method indicating that its return type is DocList[OutputDoc].

Thanks, @JoanFM. I had tried that but couldn't even start the executor:

Traceback (most recent call last):
  File "/home/envs/env2/lib/python3.11/site-packages/jina/serve/executors/decorators.py", line 393, in __set_name__
    self._inject_owner_attrs(owner, name, request_schema, response_schema)
  File "/home/envs/env2/lib/python3.11/site-packages/jina/serve/executors/decorators.py", line 351, in _inject_owner_attrs
    fn_with_schema = _FunctionWithSchema.get_function_with_schema(self.fn)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/envs/env2/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 329, in get_function_with_schema
    fn_with_schema.validate()
  File "/home/envs/env2/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 211, in validate
    raise Exception(
Exception: The response_schema schema for get_output_doc: <class 'data_models.InputDoc.InputDoc'> is not a BaseDoc. Please make sure that your endpoint used BaseDoc for request and response schema

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/code/deploy.py", line 6, in <module>
    from vocal.chat_engine.ChatEngine2 import ChatEngine
  File "/home/code/MyExec.py", line 57, in <module>
    class ChatEngine(Executor):
  File "/home/envs/env2/lib/python3.11/site-packages/jina/serve/executors/__init__.py", line 130, in __new__
    _cls = super().__new__(cls, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/envs/env2/lib/python3.11/site-packages/jina/jaml/__init__.py", line 526, in __new__
    _cls = super().__new__(cls, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Error calling __set_name__ on 'FunctionMapper' instance 'get_output_doc' in 'MyExec'

netw0rkf10w commented 1 week ago

Another very strange issue that I encountered is that when switching the return type annotation to OutputDoc:

async def get_output_doc(self, doc: InputDoc, **kwargs) -> OutputDoc

the executor couldn't start either, and I got a continuous stream of warnings that goes on forever:

INFO   executor/rep-0@2704 start server bound to 0.0.0.0:54782                                                                                             [06/24/24 08:23:01]
WARNI… gateway@2705 Getting endpoints failed: Failed to get endpoints. Waiting for another trial                                                           [06/24/24 08:23:01]
WARNI… gateway@2705 Getting endpoints failed: Failed to get endpoints. Waiting for another trial                                                           [06/24/24 08:23:02]
WARNI… gateway@2705 Getting endpoints failed: Failed to get endpoints. Waiting for another trial                                                           [06/24/24 08:23:03]
WARNI… gateway@2705 Getting endpoints failed: Failed to get endpoints. Waiting for another trial                                                           [06/24/24 08:23:04]

JoanFM commented 1 week ago

Hello @netw0rkf10w,

Jina accepts two options:

  1. DocList in, DocList out.
  2. Doc in, Doc out, but in a streaming way, which means the Executor method should yield its results instead of returning them.
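
For illustration, here is a minimal plain-Python sketch of these two shapes. It deliberately does not import Jina: InputDoc and OutputDoc are dataclass stand-ins for the BaseDoc subclasses in this issue, List[...] stands in for DocList[...], and process_doc is a hypothetical helper. The method in the original report takes a single InputDoc and returns a DocList, so it matches neither shape.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


# Stand-ins (assumptions): in Jina these would subclass docarray's BaseDoc,
# and List[...] below would be DocList[...].
@dataclass
class InputDoc:
    text: str


@dataclass
class OutputDoc:
    text: str


async def process_doc(doc: InputDoc) -> List[OutputDoc]:
    """Hypothetical stand-in for self.process_doc: one input, several outputs."""
    return [OutputDoc(text=f"{doc.text}-{i}") for i in range(3)]


# Option 1: list in, list out. The handler receives a batch and returns a batch.
async def get_output_docs_batch(docs: List[InputDoc]) -> List[OutputDoc]:
    results: List[OutputDoc] = []
    for doc in docs:
        results.extend(await process_doc(doc))
    return results


# Option 2: one doc in, docs streamed out. The handler is an async generator
# that yields one OutputDoc at a time instead of returning a list.
async def get_output_docs_stream(doc: InputDoc) -> AsyncIterator[OutputDoc]:
    for output_doc in await process_doc(doc):
        yield output_doc


async def collect_stream(doc: InputDoc) -> List[OutputDoc]:
    """Consume the stream one doc at a time, as a streaming client would."""
    return [out async for out in get_output_docs_stream(doc)]


batch_result = asyncio.run(get_output_docs_batch([InputDoc(text="a")]))
stream_result = asyncio.run(collect_stream(InputDoc(text="a")))
```

In Jina terms, option 1 would correspond to an Executor method annotated with DocList[InputDoc] as input and DocList[OutputDoc] as return type, with the client sending a DocList as well. Option 2 would correspond to a method annotated with InputDoc and OutputDoc that yields each result, which to my knowledge is consumed on the client side through a streaming call such as stream_doc rather than post.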