marqo-ai / marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
https://www.marqo.ai/
Apache License 2.0

Documents that fail to index due to size should raise a warning or error [ENHANCEMENT] #396

Open OwenPendrighElliott opened 1 year ago

OwenPendrighElliott commented 1 year ago

Is your feature request related to a problem? Please describe.

Adding a document that is too large fails too silently. When a document exceeds the maximum size, no error or warning is raised in the Python client or the Docker logs; the only place the error is reported is the response body.

Describe the solution you'd like

Additional context

Below is an example to illustrate the behaviour:

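import marqo

# Connect to a locally running Marqo instance.
mq = marqo.Client(url="http://localhost:8882")

# A single field large enough to exceed the maximum document size.
long_document = "0" * 100000

# No exception or warning is raised here; the failure is only reported in the response body.
response = mq.index("my-index").add_documents(
    [{"Title": "my-document", "Description": long_document}],
)

print("Response from indexing:", response)
print()
# The index stats show whether the document was actually indexed.
print("Index stats:", mq.index("my-index").get_stats())

jalajk24 commented 1 year ago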

@OwenPendrighElliott Can I work on this issue? Could you please share some insight on it, as I am new to this?

OwenPendrighElliott commented 1 year ago

Hi @jalajk24, I will defer to @pandu-k for insight into the code changes themselves.

To get set up, you can read about contributing and follow our dev setup instructions. The code for the Python client can be found in the py-marqo repo; you can follow the setup instructions there as well.

The example I provided in the issue should let you reproduce the behaviour I described.
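
As a rough starting point, the client-side part of this could look something like the sketch below: scan the response body for failed items and raise a warning for each one. This is not how py-marqo does it today. The helper name `warn_on_failed_documents` is hypothetical, and the response shape (an `items` list whose entries carry `_id`, `status` and an optional `error` message) is an assumption that should be checked against a real response before relying on it. The sample response is hand-written so the snippet runs without a Marqo instance.

```python
import warnings


def warn_on_failed_documents(response):
    """Warn about each document the server reported as failed.

    Hypothetical helper. Assumes a response shaped like
    {"items": [{"_id": ..., "status": int, "error": str}, ...]}.
    Returns the list of failed items so callers can also inspect them.
    """
    failed = [
        item
        for item in response.get("items", [])
        if "error" in item or item.get("status", 200) >= 400
    ]
    for item in failed:
        warnings.warn(
            f"Document {item.get('_id', '<unknown>')} was not indexed: "
            f"{item.get('error', 'no error message given')}",
            RuntimeWarning,
            stacklevel=2,
        )
    return failed


# Hand-written stand-in for a response body; not captured from a real server.
example_response = {
    "items": [
        {"_id": "doc-1", "status": 200},
        {"_id": "doc-2", "status": 400, "error": "document exceeds the maximum size"},
    ],
}
failed_items = warn_on_failed_documents(example_response)
```

Using `warnings.warn` rather than raising keeps the successfully indexed documents in a batch usable while still surfacing the failures; whether a warning or an error is the right default is the open question in this issue.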