jina-ai / serve

☁️ Build multimodal AI applications with cloud-native stack
https://jina.ai/serve
Apache License 2.0
21.14k stars 2.22k forks

Two fields with the same type are conflicted with KeyError in pydantic #6140

Closed oytuntez closed 5 months ago

oytuntez commented 9 months ago

Describe the bug: I have a type that looks like this:

class QuoteFileType(BaseDoc):
    """
        QuoteFileType class.
    """
    id: str = None  # same as name, compatibility reasons for a generic, shared `id` field
    name: str = None
    total_count: int = None
    docs: DocList[QuoteFile] = None
    chunks: DocList[QuoteFile] = None

This worked fine before v3.23.3; see PR: https://github.com/jina-ai/jina/pull/6138

With 3.23.3, we are now getting this error while initializing the gateway:

DEBUG  gateway/rep-0@81761 gRPC call to hnsw for                                
       EndpointDiscovery errored, with error <AioRpcError of                    
       RPC that terminated with:                                                
               status = StatusCode.UNKNOWN                                      
               details = "Unexpected <class 'KeyError'>:                        
       <class 'pydantic.main.ImageDocument'>"                                   
               debug_error_string = "UNKNOWN:Error received                     
       from peer                                                                
       {created_time:"2024-02-16T13:28:39.399741-05:00",                        
       grpc_status:2, grpc_message:"Unexpected <class                           
       \'KeyError\'>: <class                                                    
       \'pydantic.main.ImageDocument\'>"}"                                      
       > and for the 2th time.                         

Describe how you solve it: The issue disappears if I remove one of the DocList[QuoteFile] fields (docs or chunks; chunks is there for backward compatibility in one of our use cases).

Environment: jina==3.23.3, docarray==latest upstream with our own fork (I don't think any of our changes are related to this issue)

oytuntez commented 9 months ago

I think this line is throwing the exception in helper._create_aux_model_doc_list_to_list:

create_model(
    model.__name__,
    __base__=model,
    __validators__=model.__validators__,
    **fields,
)

Somehow I can't see proper exceptions or stack traces in my Jina flows...

JoanFM commented 9 months ago

Can you please share a minimal reproducible example of the issue? Something that I can run to see the problem right away?

JoanFM commented 9 months ago

These exceptions are not easily seen because the collection of schemas from the Executors is done asynchronously, and exceptions raised there are not reported well.
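To illustrate how an exception can go missing when work is collected asynchronously, here is a minimal sketch using only the standard library. It is not Jina's gateway code: `discover_endpoints` and `collect_schemas` are hypothetical stand-ins, and the use of `asyncio.gather(..., return_exceptions=True)` is an assumption about the general pattern, not a statement about how Jina implements endpoint discovery.

```python
import asyncio
import traceback


async def discover_endpoints(name: str) -> dict:
    """Hypothetical stand-in for a per-Executor schema request."""
    if name == "hnsw":
        # Simulate the failure seen in the log above.
        raise KeyError("ImageDocument")
    return {"executor": name, "endpoints": ["/search"]}


async def collect_schemas(names):
    # return_exceptions=True keeps one failing task from aborting the others,
    # but the exception is then returned as a value instead of being raised.
    results = await asyncio.gather(
        *(discover_endpoints(n) for n in names), return_exceptions=True
    )
    schemas, errors = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            # Format the full traceback explicitly; otherwise only a short
            # message such as "Unexpected <class 'KeyError'>" may survive.
            errors[name] = "".join(
                traceback.format_exception(
                    type(result), result, result.__traceback__
                )
            )
        else:
            schemas[name] = result
    return schemas, errors


schemas, errors = asyncio.run(collect_schemas(["encoder", "hnsw"]))
for name, tb in errors.items():
    print(f"schema collection failed for {name}:\n{tb}")
```

Unless the collecting side formats and logs the traceback itself, as the loop above does, the caller only sees whatever summary the transport layer puts into the error message.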

JoanFM commented 9 months ago

Minimal Reproducible Example showcasing the underlying problem:

from docarray import DocList, BaseDoc
from jina.serve.runtimes.helper import _create_aux_model_doc_list_to_list

class MyTextDoc(BaseDoc):
    text: str

class QuoteFile(BaseDoc):
    texts: DocList[MyTextDoc]

class QuoteFileType(BaseDoc):
    """
        QuoteFileType class.
    """
    id: str = None  # same as name, compatibility reasons for a generic, shared `id` field
    name: str = None
    total_count: int = None
    docs: DocList[QuoteFile] = None
    chunks: DocList[QuoteFile] = None

new_model = _create_aux_model_doc_list_to_list(QuoteFileType)
new_model.schema()

Somehow new_model has a problem when .schema() is called on it; that is where the KeyError is raised.
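The thread does not pin down the root cause. One mechanism that can produce a KeyError whose key is a class is a lookup table keyed by class objects that is asked about a different class object with the same name. The sketch below shows that mechanism in plain Python, without pydantic or Jina; `make_model`, `name_map`, and `make_model_cached` are hypothetical names, and whether this is what actually happens inside the `.schema()` call is an assumption.

```python
def make_model(name: str) -> type:
    """Create a fresh class dynamically, loosely like a model factory would."""
    return type(name, (object,), {})


# Two classes built from the same name are still distinct objects.
first = make_model("ImageDocument")
second = make_model("ImageDocument")

# A registry keyed by class object only knows about the first one.
name_map = {first: "ImageDocument"}

try:
    name_map[second]
    lookup_error = None
except KeyError as exc:
    # The key in the error is the class itself, as in the gateway log.
    lookup_error = exc

# Caching generated classes by name returns the same object every time,
# so a class-keyed lookup keeps working across repeated calls.
_cache: dict = {}


def make_model_cached(name: str) -> type:
    if name not in _cache:
        _cache[name] = make_model(name)
    return _cache[name]
```

If something like this is the cause, generating the auxiliary model once per source class and reusing it for both `docs` and `chunks` would avoid the mismatch, which fits the observation that removing one of the two fields makes the error disappear.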

JoanFM commented 9 months ago

I have a candidate PR to fix this: https://github.com/jina-ai/jina/pull/6141

I will try to add tests soon, but you can test it in advance if possible.

oytuntez commented 9 months ago

hmm, didn't work quite well; we may need to bring cached_models from the topology graph – with a local cached_models list, it still gives the same error

JoanFM commented 9 months ago

> hmm, didn't work quite well; we may need to bring cached_models from the topology graph – with a local cached_models list, it still gives the same error

Can you please add the example that is giving the error? Otherwise it is very hard to debug.

JoanFM commented 9 months ago

I have updated the PR, can you try again and provide a minimal reproducible example for us to make sure we add relevant testing scenarios?

jina-bot commented 6 months ago

@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days