Add GPU video encoder, rankers and segmenters - Githubissues

jina-ai / executors

internal-only

Apache License 2.0


Add GPU video encoder, rankers and segmenters #169

Closed jakobkruse1 closed 3 years ago

jakobkruse1 commented 3 years ago

Some executors are still missing GPU support. This issue completes GPU support for the executors. If other executors are also missing GPU support, add them to the list.

Video Encoders:

[x] VideoTorchEncoder

Segmenters:

[x] TorchObjectDetectionSegmenter
[ ] ~~VADSpeechSegmenter~~ (not possible)
[ ] ~~YoloV5Segmenter~~ (postponed)

Rankers:

[x] DPRReaderRanker (in review)

jacobowitz commented 3 years ago

I don't think YoloV5Segmenter is actually missing GPU support. It can already run on GPU, but there is no distinction between the CPU and GPU images. We could add one by forcing the CPU version of PyTorch, but I don't think that's a good idea, since PyTorch is only an indirect dependency pulled in by YoloV5. As far as I can see, YoloV5 does not support CPU only.

tadejsv commented 3 years ago

Why does it not support CPU only? Can you not do model.to('cpu')?
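To make the runtime side of this concrete, here is a minimal sketch of picking the device before calling `model.to(...)`. `resolve_device` is a hypothetical helper, not something from the executors repo or from yolov5, and it takes the CUDA availability as a plain boolean so the sketch runs without PyTorch installed.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return the device string to use, falling back to CPU when CUDA is absent."""
    # Any "cuda" or "cuda:N" request is downgraded if no GPU is available.
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


# Usage with PyTorch (assumed, not imported here):
#   import torch
#   device = resolve_device("cuda", torch.cuda.is_available())
#   model.to(device)
```

This only covers where the model runs. It does not change which PyTorch build gets installed.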

jacobowitz commented 3 years ago

Sure, I mean dependency-wise. There is no yolov5[cpu] or similar; yolov5 always uses the full torch version with GPU support. So if we want to split into CPU/GPU versions, we would need to override PyTorch in our requirements. We can certainly do so, but the issue is that we would then also need to change our requirements whenever the upstream requirements change, which is a bit annoying?

tadejsv commented 3 years ago

Since we pin our requirements to an exact version, we won't have this problem unless we are changing our requirements intentionally, so I think this is manageable. I would still go ahead and create a CPU-only version.

jacobowitz commented 3 years ago

VADSpeechSegmenter cannot actually run on GPU. I tried to add support, and inference fails with this message:

RuntimeError: Could not run 'quantized::linear_dynamic' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'quantized::linear_dynamic' is only available for these backends: [CPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, Tracer, Autocast, Batched, VmapMode]

Also there is https://github.com/pytorch/pytorch/issues/42288

jacobowitz commented 3 years ago

DPRReaderRanker: https://github.com/jina-ai/executors/pull/198
TorchObjectDetectionSegmenter: https://github.com/jina-ai/executors/pull/199
VideoTorchEncoder: https://github.com/jina-ai/executors/pull/180

jacobowitz commented 3 years ago

I suggest removing YoloV5Segmenter from the scope of this ticket and closing the ticket once the related PRs are merged. For yolov5 I've created a separate issue here: https://github.com/jina-ai/executors/issues/200