jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0

Encoder is not working #4118

Closed jyotikhetan closed 2 years ago

jyotikhetan commented 2 years ago

I am trying the example from the Jina blog, but the encoder is not working: it fails with an "index out of range" error. Please help me understand what is wrong. https://docs.jina.ai/datatype/image/image2image/

The error I am facing is:

`Encoder@5558[E]:IndexError('list index out of range')
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 285, in _msg_callback
    processed_msg = self._callback(msg)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 271, in _callback
    msg = self._post_hook(self._handle(self._pre_hook(msg)))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 226, in _handle
    peapod_name=self.name,
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 165, in handle
    field='groundtruths',
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/__init__.py", line 198, in __call__
    self, **kwargs
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/decorators.py", line 106, in arg_wrapper
    return fn(*args, **kwargs)
  File "/home/jyoti/image_pipline/image_image.py", line 22, in predict
    embeds = self._embedder.predict(docs.get_attributes('uri'))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 247, in wrapper
    result = func(self, *args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 418, in predict
    x = data_pipeline.worker_preprocessor(running_stage, collate_fn=dataloader.collate_fn)(x)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/batch.py", line 231, in forward
    samples = self.collate_fn(samples, metadata)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/utils.py", line 178, in forward
    return self.func(*args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/process.py", line 384, in collate
    return collate_fn(samples)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/utils.py", line 178, in forward
    return self.func(*args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/transforms.py", line 118, in kornia_collate
    return default_collate(samples)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 46, in default_collate
    elem = batch[0]
IndexError: list index out of range`
JoanFM commented 2 years ago

The link does not seem to point to a valid page.

jyotikhetan commented 2 years ago

https://docs.jina.ai/datatype/image/image2image/ I hope this one works.

makram93 commented 2 years ago

Hi @jyotikhetan,

I tried out the example in the blog, and with a few changes, it seems to be working. Please use lightning-flash==0.5.0. You can find my working solution here

Thank you for raising this issue. Please let me know if you find any other problems in the working colab. Meanwhile, I will update the blog accordingly.

jyotikhetan commented 2 years ago

Hi @makram93, I have tried that already, but I am still having trouble.

 `   def predict(self, docs: DocumentArray, **kwargs):
        embeds = self._embedder.predict(docs.get_attributes('uri'))`

The error log points to this line. Also, in the Colab, the installed pytorch_lightning version can't import _PatchDataloader; it's _Dataloader.

makram93 commented 2 years ago

Hi @jyotikhetan,

Can you let me know your jina, pytorch-lightning and lightning-flash versions? I have updated the Colab, and it should be working now. The problem is mainly due to mismatched dependency versions. Let me know if there is still an issue with it.
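One way to collect the three versions asked for here is a short script like the one below. It is a minimal sketch, assuming Python 3.8 or newer, where `importlib.metadata` is part of the standard library (on Python 3.7, as seen in the tracebacks, the `importlib_metadata` backport provides the same functions). The helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


# Report the three packages mentioned in this thread.
for name, ver in installed_versions(
    ["jina", "pytorch-lightning", "lightning-flash"]
).items():
    print(f"{name}: {ver or 'not installed'}")
```

Pasting the printed lines into a reply makes it easy to compare a local environment against the Colab.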

jyotikhetan commented 2 years ago

hi @makram93

I was still getting the same error after installing the versions recommended in the Colab, so I ran the Colab file itself. There I am also getting the same error.

Please help me understand where I am going wrong.

`        Encoder@349[E]:IndexError('list index out of range')
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/jina/peapods/runtimes/zmq/zed.py", line 285, in _msg_callback
    processed_msg = self._callback(msg)
  File "/usr/local/lib/python3.7/dist-packages/jina/peapods/runtimes/zmq/zed.py", line 271, in _callback
    msg = self._post_hook(self._handle(self._pre_hook(msg)))
  File "/usr/local/lib/python3.7/dist-packages/jina/peapods/runtimes/zmq/zed.py", line 226, in _handle
    peapod_name=self.name,
  File "/usr/local/lib/python3.7/dist-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 171, in handle
    field='groundtruths',
  File "/usr/local/lib/python3.7/dist-packages/jina/executors/__init__.py", line 204, in __call__
    self, **kwargs
  File "/usr/local/lib/python3.7/dist-packages/jina/executors/decorators.py", line 106, in arg_wrapper
    return fn(*args, **kwargs)
  File "<ipython-input-6-82e6780722f0>", line 13, in predict
    embeds = self._embedder.predict(docs.get_attributes('uri'))
  File "/usr/local/lib/python3.7/dist-packages/flash/core/model.py", line 247, in wrapper
    result = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/flash/core/model.py", line 415, in predict
    dataset = data_pipeline.data_source.generate_dataset(x, running_stage)
  File "/usr/local/lib/python3.7/dist-packages/flash/core/data/data_source.py", line 326, in generate_dataset
    is_none = data[0] is None
IndexError: list index out of range`

Thank you

makram93 commented 2 years ago

@jyotikhetan Can you share your Colab with the reproducible error? It will be easier for me to take a look at your code.

JoanFM commented 2 years ago

@jyotikhetan, it seems that docs.get_attributes('uri') may not actually be returning URIs. Can you print the result of

docs.get_attributes('uri') in that executor?
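The last frame of the traceback above (`is_none = data[0] is None` raising `IndexError`) suggests that the list handed to the embedder was empty. A defensive wrapper along the following lines could avoid calling the embedder in that case. This is only a sketch under assumptions: `embed_uris` is a hypothetical helper, and `embed_fn` stands in for `self._embedder.predict`.

```python
from typing import Callable, List, Optional, Sequence


def embed_uris(
    uris: Sequence[str],
    embed_fn: Callable[[List[str]], Sequence],
) -> List[Optional[object]]:
    """Embed only non-empty URIs and keep results aligned with the input.

    Positions without a usable URI get None, and embed_fn is never
    called with an empty list.
    """
    indexed = [(i, uri) for i, uri in enumerate(uris) if uri]
    results: List[Optional[object]] = [None] * len(uris)
    if not indexed:
        return results
    embeddings = embed_fn([uri for _, uri in indexed])
    for (i, _), embedding in zip(indexed, embeddings):
        results[i] = embedding
    return results
```

Inside the executor, the call would then look like `embeds = embed_uris(docs.get_attributes('uri'), self._embedder.predict)`, and the loop over `zip(docs, embeds)` would skip entries where the embedding is `None`.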

jyotikhetan commented 2 years ago

Hi @makram93, the one I shared is from the Colab file itself. Thank you

hi @JoanFM

`    def predict(self, docs: DocumentArray, **kwargs):
        print(docs.get_attributes('uri'))
        embeds = self._embedder.predict(docs.get_attributes('uri'))
        print(embeds)
        for doc, embed in zip(docs, embeds):
            doc.embedding = embed.numpy()`

So I am getting the URIs of the files I tried indexing correctly: ['/home/jyoti/image_pipline/test2/7.jpg', '/home/jyoti/image_pipline/test2/5.jpg', '/home/jyoti/image_pipline/test2/8.jpg', '/home/jyoti/image_pipline/test2/2.jpg',

I also got the corresponding embeddings by printing embeds. When I tried to search using Swagger UI, it throws this error:

`        Encoder@4432[E]:IndexError('list index out of range')
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 285, in _msg_callback
    processed_msg = self._callback(msg)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 271, in _callback
    msg = self._post_hook(self._handle(self._pre_hook(msg)))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 226, in _handle
    peapod_name=self.name,
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 165, in handle
    field='groundtruths',
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/__init__.py", line 198, in __call__
    self, **kwargs
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/decorators.py", line 106, in arg_wrapper
    return fn(*args, **kwargs)
  File "/home/jyoti/image_pipline/imageot/image_image.py", line 23, in predict
    embeds = self._embedder.predict(docs.get_attributes('uri'))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 247, in wrapper
    result = func(self, *args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 415, in predict
    dataset = data_pipeline.data_source.generate_dataset(x, running_stage)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/data_source.py", line 326, in generate_dataset
    is_none = data[0] is None
IndexError: list index out of range
`
JoanFM commented 2 years ago

Well, the error seems to be clearly here,

So can you please share the exact steps and the complete logs (not only the error, but all the steps that work as well)?

jyotikhetan commented 2 years ago

hi @JoanFM

This is the complete code from the shared link:

from jina import DocumentArray
from jina.types.document.generators import from_files

docs_array = DocumentArray(from_files(f"/home/jyoti/image_pipline/test2/*.jpg"))

from jina import DocumentArray, Executor, requests
from flash.image import ImageEmbedder

class FlashImageEncoder(Executor):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._embedder = ImageEmbedder(embedding_dim=1024)

    @requests
    def predict(self, docs: DocumentArray, **kwargs):
        print(docs.get_attributes('uri'))
        embeds = self._embedder.predict(docs.get_attributes('uri'))
        print(embeds)
        for doc, embed in zip(docs, embeds):
            doc.embedding = embed.numpy()

from jina import DocumentArrayMemmap, DocumentArray, Executor, requests

class SimpleIndexer(Executor):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._dam = DocumentArrayMemmap(self.workspace)

    @requests(on='/index')
    def index(self, docs: DocumentArray, **kwargs):
        self._dam.extend(docs)

    @requests(on='/search')
    def search(self, docs: DocumentArray, **kwargs):
        docs.match(self._dam)

from jina import Flow

f = (
    Flow
        (cors=True, port_expose=12345, protocol="http")
        .add(uses='FlashImageEncoder', name="Encoder")
        .add(uses=SimpleIndexer, name="Indexer")

)
# docs_array = DocumentArray(from_files(f"/home/jyoti/image_pipline/test/*.jpg"))
with f:
    f.post('/index', docs_array)
    f.block()

from jina import Client, Document
from jina.types.request import Response

def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclidean"].value:2f}: "{d.text}"')

c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(uri='/home/jyoti/image_pipline/test2/1.jpg'), on_done=print_matches)

I am using Swagger UI to see the results. Indexing completes correctly; after that, it throws this error:

`        Encoder@20316[E]:IndexError('list index out of range')
 add "--quiet-error" to suppress the exception details
Traceback (most recent call last):
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 285, in _msg_callback
    processed_msg = self._callback(msg)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 271, in _callback
    msg = self._post_hook(self._handle(self._pre_hook(msg)))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/zmq/zed.py", line 226, in _handle
    peapod_name=self.name,
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/peapods/runtimes/request_handlers/data_request_handler.py", line 165, in handle
    field='groundtruths',
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/__init__.py", line 198, in __call__
    self, **kwargs
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/jina/executors/decorators.py", line 106, in arg_wrapper
    return fn(*args, **kwargs)
  File "/home/jyoti/image_pipline/imageot/image_image.py", line 23, in predict
    embeds = self._embedder.predict(docs.get_attributes('uri'))
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 247, in wrapper
    result = func(self, *args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/model.py", line 418, in predict
    x = data_pipeline.worker_preprocessor(running_stage, collate_fn=dataloader.collate_fn)(x)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/batch.py", line 231, in forward
    samples = self.collate_fn(samples, metadata)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/utils.py", line 178, in forward
    return self.func(*args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/process.py", line 384, in collate
    return collate_fn(samples)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/utils.py", line 178, in forward
    return self.func(*args, **kwargs)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/flash/core/data/transforms.py", line 118, in kornia_collate
    return default_collate(samples)
  File "/home/jyoti/image_pipline/image_pipenv/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 46, in default_collate
    elem = batch[0]
IndexError: list index out of range
`

Thank you

JoanFM commented 2 years ago

Without Swagger UI, what is the result of running this part:

c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(uri='/home/jyoti/image_pipline/test2/1.jpg'), on_done=print_matches)
jyotikhetan commented 2 years ago

hi @JoanFM

I am getting empty strings:

`/home/jyoti/image_pipline/image_pipenv/bin/python /home/jyoti/image_pipline/imageot/app.py
[0]0.000000: ""
[1]0.000000: ""
[2]0.000000: ""

Process finished with exit code 0`
JoanFM commented 2 years ago

Hey @jyotikhetan,

Then the problem is a different one. The error you posted may be due to incorrect usage of the Swagger UI.

So let's focus first on these "" results. These results are expected.

Please pay attention to your print_matches function:
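As a hedged illustration of the point: the callback in the posted code prints `d.text`, but the indexed documents were created from image files and only carry a `uri`, so the text is empty. A minimal sketch of a variant that prints the URI instead is shown below. It is an assumption about what a fix could look like, not a confirmed snippet: `format_matches` is a hypothetical helper, the `SimpleNamespace` objects are stand-ins for real matches, and the score key must be whatever metric the indexer actually used (reading a metric that was not computed might also explain the 0.000000 scores).

```python
from types import SimpleNamespace


def format_matches(matches, metric, top_k=3):
    """Return one line per match, showing the match URI and its score."""
    lines = []
    for idx, match in enumerate(matches[:top_k]):
        score = match.scores[metric].value
        lines.append(f"[{idx}] {score:.6f}: {match.uri}")
    return lines


# Stand-in objects that mimic the attributes used above (hypothetical data).
demo_matches = [
    SimpleNamespace(uri="test2/7.jpg", scores={"cosine": SimpleNamespace(value=0.25)}),
    SimpleNamespace(uri="test2/5.jpg", scores={"cosine": SimpleNamespace(value=0.5)}),
]

for line in format_matches(demo_matches, metric="cosine"):
    print(line)
```

In the real callback, the same loop would run over `resp.docs[0].matches`.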

jyotikhetan commented 2 years ago

hi @JoanFM ,

Yes, it's working now. Please update the docs as well: https://docs.jina.ai/datatype/image/image2image/

Thank you