Open oke-aditya opened 4 years ago
@pmeier I added this to discussion here. Maybe it will need some time and more thorough thought.
@pmeier Any thoughts or updates? I think this would be an important feature addition. It would allow training future models for object detection, keypoint detection and semantic segmentation on TPUs.
Hi,
I don't think that we currently support TPUs for the detection models. I believe part of the difficulty lies in the fact that these models have dynamic shapes, which are not very well-suited for TPUs. Additionally, we also have custom ops in torchvision for those models, which I don't think have a direct TPU mapping.
@ailzhang can you chime in with more details?
Hi, for the custom ops implemented in torchvision with only CPU & CUDA implementations (instead of PyTorch native ops), we currently support them as custom ops in pytorch/xla as well, upon request. For example, we added nms support (https://github.com/pytorch/xla/blob/d5b0b4e077496bb5cfaf823cc07f6f371b1a2af6/torch_xla/core/functions.py#L87) upon user request. Feel free to open feature requests in pytorch/xla for other ops; we'll put them on our todo list to implement.
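To make the dynamic-shape concern concrete: plain NMS returns a different number of boxes for every image, which forces a compiler like XLA to deal with changing shapes. One common workaround is to pad the result to a fixed length and return a count of valid entries. The following is only a minimal pure-Python sketch of that idea; `padded_nms` and its parameters are hypothetical names for illustration and are not the torch_xla or torchvision implementation.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def padded_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    output_size: int,
) -> Tuple[List[int], int]:
    """Greedy NMS whose result always has exactly `output_size` entries.

    Unused slots are filled with -1. The second return value is the
    number of valid (non-padding) indices.
    """
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if len(keep) == output_size:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    num_valid = len(keep)
    return keep + [-1] * (output_size - num_valid), num_valid
```

Downstream code would then need to mask out the `-1` slots using the returned count, which is the price paid for a shape that never changes.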
As @fmassa pointed out, these models might not be suitable for TPUs because of dynamic shapes. I have a few doubts, though.
Assume the user has fixed-size images (through transforms or preprocessing) and feeds them as input to the detection models. Would that still be unsuitable for TPUs?
Is it because torchvision uses GeneralizedRCNNTransform() for detection models?
Will it be advantageous and feasible to support the additional torchvision ops using xla that are needed for detection? Can we add them to enable TPU support?
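For the fixed-size question, it may help to see how the input shape is derived. The sketch below approximates, in pure Python, the resize-and-pad arithmetic that GeneralizedRCNNTransform applies: scale so the shorter side reaches `min_size` without the longer side exceeding `max_size`, then pad the batch to a multiple of `size_divisible`. The defaults of 800, 1333 and 32 are what I understand the Faster R-CNN defaults to be, and the exact rounding may differ between torchvision versions, so treat this as an assumption-laden illustration rather than the library's code. `resized_shape` and `batch_shape` are hypothetical helper names.

```python
import math
from typing import Sequence, Tuple

Shape = Tuple[int, int]  # (height, width)


def resized_shape(h: int, w: int, min_size: int = 800, max_size: int = 1333) -> Shape:
    """Approximate image size after the detection transform's resize step."""
    # Scale the shorter side up to min_size, unless that would push the
    # longer side past max_size.
    scale = min(min_size / min(h, w), max_size / max(h, w))
    return math.floor(h * scale), math.floor(w * scale)


def batch_shape(shapes: Sequence[Shape], size_divisible: int = 32) -> Shape:
    """Approximate padded (height, width) of a batch of images."""
    resized = [resized_shape(h, w) for h, w in shapes]
    max_h = max(h for h, _ in resized)
    max_w = max(w for _, w in resized)

    def round_up(v: int) -> int:
        # Round up to the next multiple of size_divisible.
        return math.ceil(v / size_divisible) * size_divisible

    return round_up(max_h), round_up(max_w)
```

Under these assumptions, batches whose images all share one size always produce the same padded input shape, so the input side can be made static. The number of proposals and detections surviving NMS inside the model still varies per image, which is a separate source of dynamic shapes.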
Currently box_ops.batched_nms is not supported, I guess, and hence I'm getting the error.
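For context, batched NMS can be built on top of a plain NMS op, so a backend that provides `nms` may not need a separate kernel for it. A well-known trick, which to my knowledge torchvision also uses in one of its code paths, is to shift the boxes of each class by a class-specific offset so that boxes from different classes never overlap, and then run ordinary NMS once. The pure-Python sketch below only illustrates that trick; the function names are hypothetical and this is not the torchvision source.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2)


def _iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def batched_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    idxs: Sequence[int],
    iou_threshold: float,
) -> List[int]:
    """Per-class NMS implemented with a single call to `nms`."""
    if not boxes:
        return []
    # Offset each box by its class index times a value larger than any
    # coordinate, so boxes of different classes cannot intersect.
    max_coord = max(max(b) for b in boxes)
    shifted = [
        [c + idx * (max_coord + 1) for c in box] for box, idx in zip(boxes, idxs)
    ]
    return nms(shifted, scores, iou_threshold)
```

The offset approach trades a little extra arithmetic for a single NMS call, which is convenient when only one NMS primitive is available on the target device.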
@ofekp
I guess the best option would be to raise an issue with the XLA team about this. Maybe they will add these ops to XLA.
I'm still unsure how much it would benefit training, given what fmassa pointed out.
@oke-aditya I opened an issue in pytorch/xla - https://github.com/pytorch/xla/issues/2487.
This would be useful not just for training but also for inference on the Google Coral TPU. There is an XLA version of nms, but how does it work with torchvision? I tried to adapt a forked version of torchvision but could not get it to work; it just hangs.
❓ Torchvision object detection models with TPU.
My doubt lies somewhere between a feature request and a question, hence posting here.
PyTorch supports TPU through torch_xla. It makes it possible to train models over TPU. I guess most torchvision classification models can be used with transfer learning/training over TPU.
For torchvision object detection models, do they support TPU? Some operations such as NMS, rpn, and roi_align do not support TPU, and hence I get an error as follows. I was trying the Faster R-CNN resnet50 fpn model for object detection.
My doubts/concerns/feature request.