Why is the spawn method used to implement the streamer?

ShannonAI / service-streamer

Boosting your Web Services of Deep Learning Applications.

Apache License 2.0

Why is the spawn method used to implement the streamer? #35

Closed starplanet closed 4 years ago

starplanet commented 4 years ago

In service_streamer.py:

mp = multiprocessing.get_context('spawn')

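Why is spawn specified here? It makes the preload feature unusable: each process gets its own copy of memory, so global variables created during program initialization cannot be shared.

I manually switched to fork mode and the program still runs fine. Why does spawn need to be set explicitly?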
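To make the difference concrete, here is a minimal sketch (not code from service-streamer) of how a forked child sees a global that the parent set at runtime. The names `MODEL`, `preload`, and `model_seen_by_child` are hypothetical, and the fork start method is assumed to be available (Linux or macOS).

```python
import multiprocessing

MODEL = None  # stands in for an expensive global, e.g. a loaded model


def _report(queue):
    # Runs in the child: send back whatever the child sees in the module global.
    queue.put(MODEL)


def model_seen_by_child(start_method):
    """Start one child with the given start method and return its view of MODEL."""
    ctx = multiprocessing.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=_report, args=(queue,))
    proc.start()
    value = queue.get(timeout=30)
    proc.join()
    return value


def preload():
    # Hypothetical "preload" step, done once in the parent before workers start.
    global MODEL
    MODEL = "weights-loaded-in-parent"
```

After `preload()`, `model_seen_by_child("fork")` returns the preloaded value, because the child starts as a copy of the parent's memory. With `"spawn"`, called from a script whose `preload()` call sits under `if __name__ == "__main__":`, the child would be expected to see `None`, since it re-imports the module and never runs `preload()`.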

Meteorix commented 4 years ago

@starplanet This is due to a PyTorch limitation, and I suspect TF behaves the same way. After switching to fork, did you run a GPU model?

starplanet commented 4 years ago

Yes. I'm using TensorFlow's SavedModel format and tested on 2 GPUs. Prediction works without problems, and the preload issue is solved.

Meteorix commented 4 years ago

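OK. Would you be interested in submitting a PR? It should provide an API for choosing fork/spawn, with spawn still the default. service-streamer doesn't want to know which framework is in use, so the choice is exposed to the user through the API.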
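One way such an option could look is sketched below. This is only an illustration of the idea, not the API that was eventually merged: the class `WorkerPool` and the parameter name `mp_start_method` are assumptions made up for this example.

```python
import multiprocessing


class WorkerPool:
    """Hypothetical stand-in for a streamer that lets the caller pick the start method."""

    def __init__(self, worker_num=1, mp_start_method="spawn"):
        # Reject names this platform does not support (e.g. "fork" on Windows).
        supported = multiprocessing.get_all_start_methods()
        if mp_start_method not in supported:
            raise ValueError(
                f"unsupported start method {mp_start_method!r}; choose from {supported}"
            )
        self.worker_num = worker_num
        # Use a private context instead of the global default, so the choice
        # does not affect other multiprocessing users in the same program.
        self.mp = multiprocessing.get_context(mp_start_method)

    @property
    def start_method(self):
        return self.mp.get_start_method()

    def make_process(self, target, args=()):
        # Workers are created from the chosen context.
        return self.mp.Process(target=target, args=args, daemon=True)
```

`WorkerPool()` keeps spawn as the default, while `WorkerPool(mp_start_method="fork")` opts in to fork for users whose framework tolerates it.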

Meteorix commented 4 years ago

Also, a PR that adds an example_tf directory to serve as an example would be welcome.

starplanet commented 4 years ago

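I've forked the project. For now I simply use import multiprocessing as mp and removed the line quoted above. Once I finish the project I'm working on, I'll look into how to do this properly, because I haven't yet worked out how this API should be designed.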

Meteorix commented 4 years ago

Closed by https://github.com/ShannonAI/service-streamer/pull/44

zhongbin1 commented 4 years ago

@starplanet @Meteorix Hello both, I have a question. When I load a TensorFlow SavedModel for predict inference and assign two workers per GPU, only one worker ends up starting. What could be causing this?

TF reports the following error: tensorflow.python.framework.errors_impl.InternalError: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_DEVICE: invalid device ordinal