jina-ai / executor-image-clip-classifier


CLIPImageClassifier error #15

Open sk-haghighi opened 2 years ago

sk-haghighi commented 2 years ago

I tried to run the following flow on "jinahub+sandbox" from my Jupyter notebook, but I got the error below. Could you please share your insight?

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
from jina import Flow
classes = ['this is a cat','this is a dog','this is a person']
doc = Document(uri='image/dog.jpg')
docs = DocumentArray()
docs.append(doc)
f = Flow().add(
    uses='jinahub://CLIPImageClassifier',name="classifier",
    uses_with={'classes':classes})

with f:
    f.post(on='/classify', inputs=docs, on_done=lambda resp: print(resp.docs[0].tags['class']['label']))

-----------------------error------------------
PkgResourcesDeprecationWarning: 1.1build1 is an invalid version and will not be supported in a future release (raised from /home/ubuntu/pyenv/lib/python3.10/site-packages/pkg_resources/__init__.py:116)
PkgResourcesDeprecationWarning: 0.1.43ubuntu1 is an invalid version and will not be supported in a future release (raised from /home/ubuntu/pyenv/lib/python3.10/site-packages/pkg_resources/__init__.py:116)
UserWarning: VersionConflict(torchvision 0.12.0+cpu (/usr/local/lib/python3.10/dist-packages), Requirement.parse('torchvision==0.10.0')) (raised from /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/hubble/helper.py:483)
ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.
╭────── 🎉 Flow is ready to serve! ──────╮
│ 🔗 Protocol GRPC │
│ 🏠 Local 0.0.0.0:55600 │
│ 🔒 Private 172.31.17.247:55600 │
│ 🌍 Public 34.221.179.218:55600 │
╰────────────────────────────────────────╯
ERROR classifier/rep-0@21134 AttributeError("'DocumentArrayInMemory' object has no attribute 'get_attributes'") [07/06/22 16:34:35]
add "--quiet-error" to suppress the exception details
╭────────────── Traceback (most recent call last) ───────────────╮
│ /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/serve/ru… │
│ in process_data │
│ │
│ 162 │ │ │ │ if self.logger.debug_enabled: │
│ 163 │ │ │ │ │ self._log_data_request(requests[0]) │
│ 164 │ │ │ │ │
│ ❱ 165 │ │ │ │ return await self._data_request_handler. │
│ 166 │ │ │ except (RuntimeError, Exception) as ex: │
│ 167 │ │ │ │ self.logger.error( │
│ 168 │ │ │ │ │ f'{ex!r}' │
│ │
│ /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/serve/ru… │
│ in handle │
│ │
│ 147 │ │ ) │
│ 148 │ │ │
│ 149 │ │ # executor logic │
│ ❱ 150 │ │ return_data = await self._executor.__acall__( │
│ 151 │ │ │ req_endpoint=requests[0].header.exec_endpoin │
│ 152 │ │ │ docs=docs, │
│ 153 │ │ │ parameters=params, │
│ │
│ /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/serve/ex… │
│ in __acall__ │
│ │
│ 271 │ │ if req_endpoint in self.requests: │
│ 272 │ │ │ return await self.__acall_endpoint__(req_end │
│ 273 │ │ elif __default_endpoint__ in self.requests: │
│ ❱ 274 │ │ │ return await self.__acall_endpoint__(__defau │
│ 275 │ │
│ 276 │ async def __acall_endpoint__(self, req_endpoint, **k │
│ 277 │ │ func = self.requests[req_endpoint] │
│ │
│ /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/serve/ex… │
│ in __acall_endpoint__ │
│ │
│ 292 │ │ │ if iscoroutinefunction(func): │
│ 293 │ │ │ │ return await func(self, **kwargs) │
│ 294 │ │ │ else: │
│ ❱ 295 │ │ │ │ return func(self, **kwargs) │
│ 296 │ │
│ 297 │ @property │
│ 298 │ def workspace(self) -> Optional[str]: │
│ │
│ /home/ubuntu/pyenv/lib/python3.10/site-packages/jina/serve/ex… │
│ in arg_wrapper │
│ │
│ 177 │ │ │ │ def arg_wrapper( │
│ 178 │ │ │ │ │ executor_instance, *args, **kwargs │
│ 179 │ │ │ │ ): # we need to get the summary from th │
│ the self │
│ ❱ 180 │ │ │ │ │ return fn(executor_instance, *args, │
│ 181 │ │ │ │ │
│ 182 │ │ │ │ self.fn = arg_wrapper │
│ 183 │
│ │
│ /home/ubuntu/.jina/hub-package/9k3zudzu/clip_image_classifier… │
│ in classify │
│ │
│ 56 │ │ for docs_batch in docs.traverse_flat( │
│ 57 │ │ │ parameters.get('traversal_paths', self.traver │
│ 58 │ │ ).batch(batch_size=parameters.get('batch_size', s │
│ ❱ 59 │ │ │ image_batch = docs_batch.get_attributes('blob │
│ 60 │ │ │ with torch.inference_mode(): │
│ 61 │ │ │ │ input = self._generate_input_features(cla │
│ 62 │ │ │ │ outputs = self.model(**input) │
╰────────────────────────────────────────────────────────────────╯
AttributeError: 'DocumentArrayInMemory' object has no attribute
'get_attributes'
Exception in thread Thread-107:
Traceback (most recent call last):
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/jina/clients/base/grpc.py", line 86, in _get_results
    async for resp in stub.Call(
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/grpc/aio/_call.py", line 326, in _fetch_stream_responses
    await self._raise_for_status()
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/grpc/aio/_call.py", line 236, in _raise_for_status
    raise _create_rpc_error(await self.initial_metadata(), await
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'"
    debug_error_string = "{"created":"@1657125275.618452649","description":"Error received from peer ipv4:0.0.0.0:58903","file":"src/core/lib/surface/call.cc","file_line":952,"grpc_message":"Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'","grpc_status":2}"
"
    debug_error_string = "{"created":"@1657125275.619606817","description":"Error received from peer ipv4:0.0.0.0:55600","file":"src/core/lib/surface/call.cc","file_line":952,"grpc_message":"Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = "Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'"\n\tdebug_error_string = "{"created":"@1657125275.618452649","description":"Error received from peer ipv4:0.0.0.0:58903","file":"src/core/lib/surface/call.cc","file_line":952,"grpc_message":"Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'","grpc_status":2}"\n>","grpc_status":2}"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/jina/helper.py", line 1292, in run
    self.result = asyncio.run(func(*args, **kwargs))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/jina/clients/mixin.py", line 164, in _get_results
    async for resp in c._get_results(*args, **kwargs):
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/jina/clients/base/grpc.py", line 155, in _get_results
    raise e
  File "/home/ubuntu/pyenv/lib/python3.10/site-packages/jina/clients/base/grpc.py", line 135, in _get_results
    raise BadClient(msg) from err
jina.excepts.BadClient: gRPC error: StatusCode.UNKNOWN Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'"
    debug_error_string = "{"created":"@1657125275.618452649","description":"Error received from peer ipv4:0.0.0.0:58903","file":"src/core/lib/surface/call.cc","file_line":952,"grpc_message":"Unexpected <class 'TypeError'>: format_exception() got an unexpected keyword argument 'etype'","grpc_status":2}"


AttributeError Traceback (most recent call last)
File ~/pyenv/lib/python3.10/site-packages/jina/helper.py:1307, in run_async(func, *args, **kwargs)
   1306 try:
-> 1307     return thread.result
   1308 except AttributeError:

AttributeError: '_RunThread' object has no attribute 'result'

During handling of the above exception, another exception occurred:

BadClient Traceback (most recent call last)
Input In [15], in <cell line: 12>()
      8 f = Flow().add(
      9     uses='jinahub://CLIPImageClassifier',name="classifier",
     10     uses_with={'classes':classes})
     12 with f:
---> 13     f.post(on='/classify', inputs=docs, on_done=lambda resp: print(resp.docs[0].tags['class']['label']))

File ~/pyenv/lib/python3.10/site-packages/jina/clients/mixin.py:173, in PostMixin.post(self, on, inputs, on_done, on_error, on_always, parameters, target_executor, request_size, show_progress, continue_on_error, return_responses, **kwargs)
    170 if return_results:
    171     return result
--> 173 return run_async(
    174     _get_results,
    175     inputs=inputs,
    176     on_done=on_done,
    177     on_error=on_error,
    178     on_always=on_always,
    179     exec_endpoint=on,
    180     target_executor=target_executor,
    181     parameters=parameters,
    182     request_size=request_size,
    183     **kwargs,
    184 )

File ~/pyenv/lib/python3.10/site-packages/jina/helper.py:1311, in run_async(func, *args, **kwargs)
   1308 except AttributeError:
   1309     from jina.excepts import BadClient
-> 1311     raise BadClient(
   1312         'something wrong when running the eventloop, result can not be retrieved'
   1313     )
   1314 else:
   1316     raise RuntimeError(
   1317         'you have an eventloop running but not using Jupyter/ipython, '
   1318         'this may mean you are using Jina with other integration? if so, then you '
   1319         'may want to use Client/Flow(asyncio=True). If not, then '
   1320         'please report this issue here: https://github.com/jina-ai/jina'
   1321     )

BadClient: something wrong when running the eventloop, result can not be retrieved
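A note on the secondary error in the log: the message "format_exception() got an unexpected keyword argument 'etype'" is what Python 3.10 and later raise when `traceback.format_exception` is called with `etype=` as a keyword, because that first parameter was renamed and made positional-only in 3.10. The log does not show which call site is responsible, so the following is only a minimal sketch of a portable way to format an exception. The helper name `format_exc_portable` is hypothetical and not part of Jina.

```python
import traceback


def format_exc_portable(exc: BaseException) -> str:
    """Format an exception without using the `etype=` keyword.

    Passing type, value and traceback positionally works both before
    and after Python 3.10.
    """
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return "".join(lines)


try:
    # Deliberately trigger an AttributeError similar to the one in the log.
    {}.get_attributes
except AttributeError as err:
    report = format_exc_portable(err)

print(report)
```

If this is indeed the cause, it would only hide the secondary `TypeError`. The primary `AttributeError` about `get_attributes` would still need to be addressed separately.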

numb3r3 commented 2 years ago

I believe this PR can address your problem. We now have a more powerful executor that integrates the CLIP model. We recommend using this new executor, which offers classification via the /rank endpoint. For example:

from jina import Flow
from docarray import Document

d = Document(
    uri='rerank.png',
    matches=[
        Document(text=f'a photo of a {p}', uri='/path/to/room.png')
        for p in (
            'control room',
            'lecture room',
            'conference room',
            'podium indoor',
            'television studio',
        )
    ],
)

f = Flow().add(
    uses='jinahub+docker://CLIPTorchEncoder',
)
with f:
    r = f.post(on='/rank', inputs=da)
    print(r['@m', ['text', 'scores__clip_score__value']])

For more details, you can check https://hub.jina.ai/executor/gzpbl8jh
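To illustrate why ranking text matches against an image amounts to classification, here is a rough sketch of the usual CLIP-style zero-shot scoring: cosine similarity between one image embedding and several text embeddings, followed by a softmax. The vectors are made-up toy values, `zero_shot_scores` is a hypothetical helper, and the temperature of 100 is an assumption based on CLIP's typical logit scale. This is not necessarily how the executor computes its `clip_score`.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(
    image_vec: Sequence[float],
    text_vecs: Dict[str, Sequence[float]],
    temperature: float = 100.0,  # assumed value, not taken from the executor
) -> Dict[str, float]:
    """Softmax over scaled cosine similarities, one probability per label."""
    sims = {label: cosine(image_vec, vec) for label, vec in text_vecs.items()}
    top = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp(temperature * (s - top)) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy embeddings standing in for real CLIP outputs.
image_vec = [0.9, 0.1, 0.0]
text_vecs = {
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a person": [0.0, 0.1, 0.9],
}

probs = zero_shot_scores(image_vec, text_vecs)
best = max(probs, key=probs.get)
print(best, probs[best])
```

The label whose text embedding is closest to the image embedding gets the highest probability, which is the "class" in a zero-shot setup.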

sk-haghighi commented 2 years ago

@numb3r3 Thanks for your response. Could you please clarify the following observations about your code?
(1) The input to the with f: block should be d, right?
(2) With {p}, it looks like you are creating a set of classes and returning a similarity score for each one. What is the purpose of having "uri='/path/to/room.png'" there, since you already have uri='rerank.png' on top?
(3) I think d is the input for you. If so, don't we need to apply any input preprocessing before sending the image to the flow?

sk-haghighi commented 2 years ago

@numb3r3 I tried to do the embedding first and I got the following error:

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
from jina import Flow
from docarray import Document, DocumentArray
import numpy as np

q = DocumentArray(
    [Document(uri='superior-pump.jpg')
     .load_uri_to_image_tensor()
     .set_image_tensor_normalization()
     .set_image_tensor_channel_axis(-1, 0)
    ])
f = Flow().add(
    uses='jinahub+sandbox://CLIPTorchEncoder', install_requirements = True, name='CTE'
)
with f:
    f.post(on='/', inputs=q)
    q.summary()


E0712 17:37:42.842941072 335007 hpack_parser.cc:1234] Error parsing metadata: error=invalid value key=content-type value=text/plain; charset=utf-8
Exception in thread Thread-99:
Traceback (most recent call last):
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/jina/clients/base/grpc.py", line 83, in _get_results
    async for resp in stub.Call(
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/grpc/aio/_call.py", line 326, in _fetch_stream_responses
    await self._raise_for_status()
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/grpc/aio/_call.py", line 236, in _raise_for_status
    raise _create_rpc_error(await self.initial_metadata(), await
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Stream removed"
    debug_error_string = "{"created":"@1657647462.843032575","description":"Error received from peer ipv4:54.241.53.116:443","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Stream removed","grpc_status":2}"
"
    debug_error_string = "{"created":"@1657647462.847747911","description":"Error received from peer ipv4:0.0.0.0:51253","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = "Stream removed"\n\tdebug_error_string = "{"created":"@1657647462.843032575","description":"Error received from peer ipv4:54.241.53.116:443","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Stream removed","grpc_status":2}"\n>","grpc_status":2}"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/jina/helper.py", line 1292, in run
    self.result = asyncio.run(func(*args, **kwargs))
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/jina/clients/mixin.py", line 176, in _get_results
    async for resp in c._get_results(*args, **kwargs):
  File "/home/ubuntu/my_env_p3.9/lib/python3.9/site-packages/jina/clients/base/grpc.py", line 134, in _get_results
    raise BadClient(msg) from err
jina.excepts.BadClient: gRPC error: StatusCode.UNKNOWN Unexpected <class 'grpc.aio._call.AioRpcError'>: <AioRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "Stream removed"
    debug_error_string = "{"created":"@1657647462.843032575","description":"Error received from peer ipv4:54.241.53.116:443","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Stream removed","grpc_status":2}"


AttributeError Traceback (most recent call last)
File ~/my_env_p3.9/lib/python3.9/site-packages/jina/helper.py:1307, in run_async(func, *args, **kwargs)
   1306 try:
-> 1307     return thread.result
   1308 except AttributeError:

AttributeError: '_RunThread' object has no attribute 'result'

During handling of the above exception, another exception occurred:

BadClient Traceback (most recent call last)
Input In [9], in <cell line: 16>()
     13 f = Flow().add(
     14     uses='jinahub+sandbox://CLIPTorchEncoder', install_requirements = True, name='CTE'
     15 )
     16 with f:
---> 17     f.post(on='/', inputs=q)
     18     q.summary()

File ~/my_env_p3.9/lib/python3.9/site-packages/jina/clients/mixin.py:185, in PostMixin.post(self, on, inputs, on_done, on_error, on_always, parameters, target_executor, request_size, show_progress, continue_on_error, return_responses, **kwargs)
    182 if return_results:
    183     return result
--> 185 return run_async(
    186     _get_results,
    187     inputs=inputs,
    188     on_done=on_done,
    189     on_error=on_error,
    190     on_always=on_always,
    191     exec_endpoint=on,
    192     target_executor=target_executor,
    193     parameters=parameters,
    194     request_size=request_size,
    195     **kwargs,
    196 )

File ~/my_env_p3.9/lib/python3.9/site-packages/jina/helper.py:1311, in run_async(func, *args, **kwargs)
   1308 except AttributeError:
   1309     from jina.excepts import BadClient
-> 1311     raise BadClient(
   1312         'something wrong when running the eventloop, result can not be retrieved'
   1313     )
   1314 else:
   1316     raise RuntimeError(
   1317         'you have an eventloop running but not using Jupyter/ipython, '
   1318         'this may mean you are using Jina with other integration? if so, then you '
   1319         'may want to use Client/Flow(asyncio=True). If not, then '
   1320         'please report this issue here: https://github.com/jina-ai/jina'
   1321     )

BadClient: something wrong when running the eventloop, result can not be retrieved

numb3r3 commented 2 years ago

@sk-haghighi Regarding your questions:

(1) the input to the with f: should be d right?

You are right, the input should be d.

(2) second {p} this looks like you are creating a set of classes to return the relevant similarity scores for each class what is the purpose of having "uri='/path/to/room.png'" here? since I see you have a uri='rerank.png' on top.

That's my fault. The uri='/path/to/room.png' should be removed. The matched documents should be text documents rather than images.

(3) also I think d for you is the input, if this is the case don't we need to apply any input preprocessing before sending the image to the flow?

Preprocessing is no longer needed because it has been integrated into the executor. That is why your code example above does not work.

Try this out, it should work:

from jina import Flow
from docarray import Document, DocumentArray

q = DocumentArray(
    [
        Document(uri='tests/img/00000.jpg', text='this is a test')
        .load_uri_to_image_tensor()
    ]
)
f = Flow().add(
    uses='jinahub+sandbox://CLIPTorchEncoder', install_requirements=True, name='CTE'
)
with f:
    f.post(on='/', inputs=q)
    q.summary()

Anyway, the correct code snippet for zero-shot classification should look like:

from jina import Flow
from docarray import Document

d = Document(
    uri='https://picsum.photos/300',
    matches=[
        Document(text=f'a photo of a {p}')
        for p in (
            'control room',
            'lecture room',
            'conference room',
            'podium indoor',
            'television studio',
        )
    ],
)

f = Flow().add(
    uses='jinahub+sandbox://CLIPTorchEncoder',
)
with f:
    r = f.post(on='/rank', inputs=d)
    print(r['@m', ['text', 'scores__clip_score__value']])

Thank you for your patience!
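For readers who want a single predicted class rather than the full ranking, here is a small sketch of picking the best label. It assumes the selector in the `print` call above yields two parallel lists, one of match texts and one of scores. That shape is an assumption, and the numbers below are invented for illustration. `top_label` is a hypothetical helper, not part of Jina or DocArray.

```python
from typing import List, Sequence, Tuple


def top_label(texts: Sequence[str], scores: Sequence[float]) -> Tuple[str, float]:
    """Return the (text, score) pair with the highest score."""
    if not texts or len(texts) != len(scores):
        raise ValueError("texts and scores must be non-empty and equally long")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return texts[best], scores[best]


# Invented values standing in for the output of the /rank request.
texts: List[str] = [
    "a photo of a control room",
    "a photo of a lecture room",
    "a photo of a conference room",
]
scores: List[float] = [0.12, 0.61, 0.27]

label, score = top_label(texts, scores)
print(label, score)
```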