jina-ai / serve

☁️ Build multimodal AI applications with cloud-native stack
https://jina.ai/serve
Apache License 2.0

Document's `integer` tags become `float` in executor #5618

Closed: solumnsilence closed this issue 1 year ago

solumnsilence commented 1 year ago

Describe the bug

When I send a DocumentArray to my executor for vectorization, all integer tags become floats.

Code with test:

from docarray import DocumentArray, Document

da = DocumentArray([
    Document(tags={'id': 1}, text='something'),
    Document(tags={'id': 2}, text='something'),
    Document(tags={'id': 3}, text='something'),
])

# Before the request: the tags hold integers
for doc in da:
    print(doc.tags)

# Send the documents to the executor for vectorization
da_with_embeddings = da.post(host='http://localhost:5345', show_progress=True)

# After the request: the same tags are reported to come back as floats
for doc in da_with_embeddings:
    print(doc.tags)
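As a stopgap on the client side, the returned tags could be normalized after the request. The following is only a minimal sketch: `restore_int_tags` is a hypothetical helper, not part of jina or docarray, and it assumes that every integer-valued float in the tags was originally an integer.

```python
from typing import Any


def restore_int_tags(value: Any) -> Any:
    """Recursively turn integer-valued floats (1.0) back into ints (1).

    Hypothetical helper, not part of jina or docarray.
    """
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {key: restore_int_tags(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [restore_int_tags(item) for item in value]
    return value


# Tags as they might look after coming back from the executor (made-up values).
returned_tags = {'id': 1.0, 'score': 0.5, 'nested': {'page': 3.0}, 'ids': [1.0, 2.5]}
restored = restore_int_tags(returned_tags)
print(restored)
```

Note the trade-off: a tag that was genuinely the float `2.0` would also become `2`, so this only fits when tags such as `id` are known to be integers.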

Environment

- jina 3.10.0
- docarray 0.20.1
- jcloud 0.2.0
- jina-hubble-sdk 0.30.4
- jina-proto 0.1.13
- protobuf 4.21.12
- proto-backend upb
- grpcio 1.47.2
- pyyaml 6.0
- python 3.7.9
- platform Linux
- platform-release 5.15.0-1025-gcp
- platform-version #32~20.04.2-Ubuntu SMP Tue Nov 29 08:31:04 UTC 2022
- architecture x86_64
- processor
- uid 2485377957892
- session-id 7c4373d6-9b52-11ed-8952-0242ac120004
- uptime 2023-01-23T19:16:46.168823
- ci-vendor (unset)
- internal False
* JINA_DEFAULT_HOST (unset)
* JINA_DEFAULT_TIMEOUT_CTRL (unset)
* JINA_DEPLOYMENT_NAME (unset)
* JINA_DISABLE_UVLOOP (unset)
* JINA_EARLY_STOP (unset)
* JINA_FULL_CLI (unset)
* JINA_GATEWAY_IMAGE (unset)
* JINA_GRPC_RECV_BYTES (unset)
* JINA_GRPC_SEND_BYTES (unset)
* JINA_HUB_NO_IMAGE_REBUILD (unset)
* JINA_LOG_CONFIG (unset)
* JINA_LOG_LEVEL CRITICAL
* JINA_LOG_NO_COLOR (unset)
* JINA_MP_START_METHOD (unset)
* JINA_OPTOUT_TELEMETRY (unset)
* JINA_RANDOM_PORT_MAX (unset)
* JINA_RANDOM_PORT_MIN (unset)

Screenshots

Test result: (image not preserved)

JoanFM commented 1 year ago

Seems related to #3343

JoanFM commented 1 year ago

This is the underlying issue on DocArray:

from docarray import DocumentArray, Document

da = DocumentArray([
    Document(tags={'id': 1}, text='something'),
    Document(tags={'id': 2}, text='something'),
    Document(tags={'id': 3}, text='something'),
])

# Tags before serialization
for doc in da:
    print(doc.tags)

# Round-trip through the protobuf protocol, with no executor involved
da.save_binary('aux.bin', protocol='protobuf')

da_loaded = DocumentArray.load_binary('aux.bin', protocol='protobuf')

# Tags after loading
for doc in da_loaded:
    print(doc.tags)

JoanFM commented 1 year ago

I will open an issue on the DocArray repo.
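A likely explanation, offered here as an assumption rather than something confirmed in this thread: if `tags` travel as a protobuf `google.protobuf.Struct`, every number is held in a `double` field, so an integer cannot come back as an integer. The sketch below does not use protobuf or DocArray at all. It only simulates storing a number in an 8-byte double with the standard library; `through_double_field` is a made-up name for illustration.

```python
import struct


def through_double_field(number):
    """Simulate writing a number to an 8-byte IEEE 754 double and reading it back."""
    return struct.unpack('<d', struct.pack('<d', number))[0]


# A small integer id survives in value but not in type.
small_id = through_double_field(1)
print(type(small_id).__name__, small_id)  # float 1.0

# Integers above 2**53 can also lose precision in a double.
big_id = 2**53 + 1
print(int(through_double_field(big_id)) == big_id)  # False
```

If that assumption holds, large integer ids are at risk of more than a type change, which is one reason to carry such ids as strings in tags.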