roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
https://inference.roboflow.com

Fix: CUDA context was failing to load in child-process #836

Closed PawelPeczek-Roboflow closed 1 day ago

PawelPeczek-Roboflow commented 1 day ago

Description

We had a bug that made it impossible to use InferencePipeline inside the inference server with CUDA support. This is a side effect of a well-known problem with CUDA and Python multiprocessing: the default process start method inside our container is fork, but spawn is recommended to make CUDA work.
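To illustrate the fork vs. spawn distinction, here is a minimal sketch that starts one child through an explicit `spawn` context. It is not the code from this PR: `cuda_safe_context`, `child_task` and `run_child` are hypothetical names, and the CUDA initialisation is only indicated by a comment.

```python
import multiprocessing as mp


def cuda_safe_context():
    """Return a multiprocessing context whose children start via `spawn`."""
    # get_context() does not change the process-wide default start method.
    return mp.get_context("spawn")


def child_task(queue) -> None:
    # A real worker would initialise CUDA here, inside the freshly spawned
    # interpreter, rather than inheriting state from a forked parent.
    queue.put("child started")


def run_child() -> str:
    """Start one spawned child and wait for its message."""
    ctx = cuda_safe_context()
    queue = ctx.Queue()
    process = ctx.Process(target=child_task, args=(queue,))
    process.start()
    message = queue.get(timeout=60)
    process.join()
    return message
```

With `spawn`, the child re-imports the main module, so `run_child()` should be called from under an `if __name__ == "__main__":` guard.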

We use basically none of the parent's resources in the stream manager app (which manages separate InferencePipeline processes inside the inference server), so we set spawn for the manager process and all its children, without affecting the server process itself. Thanks to this change, we can successfully spawn multiple downstream InferencePipeline processes.
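One way to scope the start method to the manager and its children is to switch the default inside the manager's own entry point, as sketched below. This is an assumption about the approach, not the PR's actual code: `manager_main`, `pipeline_worker` and `use_spawn_for_descendants` are hypothetical names.

```python
import multiprocessing as mp
from typing import List


def pipeline_worker(pipeline_id: str) -> None:
    # Hypothetical stand-in for a process that would run one InferencePipeline.
    print(f"pipeline {pipeline_id} running in a spawned process")


def use_spawn_for_descendants() -> str:
    """Switch the current process's default start method to `spawn`."""
    # force=True overrides a default that was already fixed (for example fork).
    mp.set_start_method("spawn", force=True)
    return mp.get_start_method()


def manager_main(pipeline_ids: List[str]) -> None:
    """Hypothetical entry point of the stream manager process."""
    # Called inside the manager only, so the process that launched the
    # manager keeps whatever start method it was already using.
    use_spawn_for_descendants()
    workers = [
        mp.Process(target=pipeline_worker, args=(pipeline_id,))
        for pipeline_id in pipeline_ids
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Because `set_start_method` is process-wide, the call belongs in the manager's entry point and not in code that the server process itself executes.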

As far as I tested, everything works as it should, but we need to keep monitoring this over time, as the change may have minor side effects that we do not see now, for instance in the nuances of workflows behaviour.

[!CAUTION] One negative side effect is that spawning each downstream process with InferencePipeline inside the inference server now takes 15s. This is a known side effect of spawn, yet the scale in our case is really big 😢. Unfortunately, this seems to be the way to go in our setup.

[!IMPORTANT] Since the change in process start method from fork to spawn introduced so much latency, this PR also introduces a daemon thread that keeps at least n idle processes ready to serve as pipeline workers. This way, latency on the user end is reduced to model load time and camera connection: ~2s instead of over 15s.
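The warm-pool idea can be sketched as follows. This is a simplified illustration, not the implementation in this PR: `WarmWorkerPool`, `start_worker`, `min_idle` and `poll_interval` are hypothetical names, and `start_worker` stands in for whatever slow call starts a spawned pipeline process.

```python
import threading
from collections import deque
from typing import Callable, Deque, Generic, TypeVar

W = TypeVar("W")


class WarmWorkerPool(Generic[W]):
    """Keeps at least `min_idle` pre-started workers ready to be claimed."""

    def __init__(
        self,
        start_worker: Callable[[], W],
        min_idle: int,
        poll_interval: float = 0.1,
    ) -> None:
        self._start_worker = start_worker
        self._min_idle = min_idle
        self._poll_interval = poll_interval
        self._idle: Deque[W] = deque()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        # Daemon thread: it never blocks interpreter shutdown.
        self._thread = threading.Thread(target=self._refill_loop, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def idle_count(self) -> int:
        with self._lock:
            return len(self._idle)

    def acquire(self) -> W:
        """Hand out an idle worker, or start one on demand if none is ready."""
        with self._lock:
            if self._idle:
                return self._idle.popleft()
        # Slow path: the caller pays the full start-up cost.
        return self._start_worker()

    def _refill_loop(self) -> None:
        while not self._stop.is_set():
            with self._lock:
                missing = self._min_idle - len(self._idle)
            for _ in range(max(missing, 0)):
                worker = self._start_worker()  # slow call, made outside the lock
                with self._lock:
                    self._idle.append(worker)
            self._stop.wait(self._poll_interval)
```

In the server's case, `start_worker` would start a process from a `spawn` context and return a handle to it, so a request that calls `acquire()` usually skips the process start-up and only waits for model load and camera connection.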

Type of change


How has this change been tested, please provide a testcase or example of how you tested the change?

Any specific deployment considerations


Docs