jina-ai / serve

☁️ Build multimodal AI applications with cloud-native stack
https://jina.ai/serve
Apache License 2.0
21.13k stars, 2.22k forks

running tensorflow executor replicas under the same strategy #4755

Closed nick-konovalchuk closed 2 years ago

nick-konovalchuk commented 2 years ago

Link to my sandbox: https://github.com/bottledmind/jina-issues

Describe the feature

If you create several TensorFlow models under the same strategy, their GPU memory usage is significantly reduced (app2.py from the sandbox). Unfortunately, creating, say, 5 replicas of an executor with the same TensorFlow model in each results in 5x GPU memory consumption (app.py from the sandbox).

Your proposal

If running the replicas under the same strategy isn't possible at the moment, it would be a nice feature to have.


Environment

Screenshots

Screenshots of watch nvidia-smi output:

app.py: [image]

app2.py: [image]

nick-konovalchuk commented 2 years ago

(tensorflow:latest-gpu container)

- jina 3.3.15
- docarray 0.13.7
- jina-proto 0.1.8
- jina-vcs-tag (unset)
- protobuf 3.19.4
- proto-backend cpp
- grpcio 1.43.0
- pyyaml 6.0
- python 3.8.10
- platform Linux
- platform-release 4.15.0-173-generic
- platform-version #182-Ubuntu SMP Fri Mar 18 15:53:46 UTC 2022
- architecture x86_64
- processor x86_64
- uid 2485377892355
- session-id 32f4bd9e-cc94-11ec-9762-0242ac110003
- uptime 2022-05-05T16:55:42.277999
- ci-vendor (unset)
* JINA_DEFAULT_HOST (unset)
* JINA_DEFAULT_TIMEOUT_CTRL (unset)
* JINA_DEFAULT_WORKSPACE_BASE /root/.jina/executor-workspace
* JINA_DEPLOYMENT_NAME (unset)
* JINA_DISABLE_UVLOOP (unset)
* JINA_FULL_CLI (unset)
* JINA_GATEWAY_IMAGE (unset)
* JINA_GRPC_RECV_BYTES (unset)
* JINA_GRPC_SEND_BYTES (unset)
* JINA_HUBBLE_REGISTRY (unset)
* JINA_HUB_CACHE_DIR (unset)
* JINA_HUB_NO_IMAGE_REBUILD (unset)
* JINA_HUB_ROOT (unset)
* JINA_LOG_CONFIG (unset)
* JINA_LOG_LEVEL (unset)
* JINA_LOG_NO_COLOR (unset)
* JINA_MP_START_METHOD (unset)
* JINA_RANDOM_PORT_MAX (unset)
* JINA_RANDOM_PORT_MIN (unset)
* JINA_VCS_VERSION (unset)

JoanFM commented 2 years ago

But these models live in the same process, whereas replicas would live in different processes, or even in different containers or machines.

I really doubt this feature can be implemented.
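
The process boundary described in this reply can be illustrated with a small standard-library sketch. It is only an analogy and makes no claim about Jina or TensorFlow internals: the "model" is a plain Python list, and the names `REPLICA_CODE`, `run_replicas`, and `run_in_one_process` are hypothetical. Each simulated replica is a fresh interpreter that has to build its own copy of the model, while users inside a single process can all hold a reference to the same object.

```python
import os
import subprocess
import sys

# Code run by each hypothetical "replica": a fresh interpreter that builds
# its own stand-in model (a plain list) and reports its process ID.
REPLICA_CODE = (
    "import os\n"
    "model = [0.0] * 1000  # stand-in for model weights\n"
    "print(os.getpid(), len(model))\n"
)


def run_replicas(n):
    """Start n separate processes; return (pid, weight_count) per replica."""
    results = []
    for _ in range(n):
        out = subprocess.run(
            [sys.executable, "-c", REPLICA_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
        pid, size = out.stdout.split()
        results.append((int(pid), int(size)))
    return results


def run_in_one_process(n):
    """Create the stand-in model once and hand the same object to n users."""
    model = [0.0] * 1000
    return [model for _ in range(n)]


replicas = run_replicas(3)       # three processes, three separate copies
shared = run_in_one_process(3)   # one process, one copy referenced three times
```

In the first case nothing is shared, so the total footprint grows with the number of replicas; in the second, every user points at the same object. Sharing across the replica case would require some mechanism outside the ordinary in-process object model, which is presumably the difficulty this reply points at.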