roytseng-tw / Detectron.pytorch

A PyTorch implementation of Detectron. Both training from scratch and inference directly from pretrained Detectron weights are available.
MIT License

Convert all Proposal code to PyTorch #146

Open Rizhiy opened 5 years ago

Rizhiy commented 5 years ago

I did some profiling, and the proposal part of the model (Generalized_RCNN.RPN) takes nearly 20% of the total time during GPU inference for me.

I looked into it and found that there is still some numpy code in there. I think that if this code is replaced with PyTorch GPU code, inference can be sped up significantly on newer GPUs.
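As a rough illustration of the kind of rewrite meant here, the sketch below decodes box regression deltas with tensor operations only, so the data can stay on whichever device it already lives on. It is not the repository's code: the function name `decode_boxes_torch`, the single set of four deltas per box, and the clip constant are assumptions made for the example.

```python
import math

import torch

# Assumed clip value for the width/height deltas (hypothetical for this sketch).
BBOX_XFORM_CLIP = math.log(1000.0 / 16.0)


def decode_boxes_torch(boxes, deltas):
    """Apply (dx, dy, dw, dh) deltas to (x1, y1, x2, y2) boxes.

    Both inputs are (N, 4) tensors on the same device; no numpy round trip.
    """
    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights

    dx, dy, dw, dh = deltas.unbind(dim=1)
    # Clamp the log-scale deltas so exp() cannot overflow.
    dw = dw.clamp(max=BBOX_XFORM_CLIP)
    dh = dh.clamp(max=BBOX_XFORM_CLIP)

    pred_ctr_x = dx * widths + ctr_x
    pred_ctr_y = dy * heights + ctr_y
    pred_w = torch.exp(dw) * widths
    pred_h = torch.exp(dh) * heights

    return torch.stack(
        [
            pred_ctr_x - 0.5 * pred_w,
            pred_ctr_y - 0.5 * pred_h,
            pred_ctr_x + 0.5 * pred_w - 1.0,
            pred_ctr_y + 0.5 * pred_h - 1.0,
        ],
        dim=1,
    )


# Falls back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
boxes = torch.tensor([[0.0, 0.0, 9.0, 9.0]], device=device)
deltas = torch.zeros(1, 4, device=device)
decoded = decode_boxes_torch(boxes, deltas)  # zero deltas leave the box unchanged
```

Because every step is a tensor operation, the result is produced on the same device as the inputs and can be passed to the next layer without a copy to host memory.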

Rizhiy commented 5 years ago

I ran some more tests, and it appears that most of the time is spent on data transfer between the GPU and the CPU: these layers take significantly less time during CPU inference than during GPU inference.
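One way to check this kind of observation is to time an operation that makes a round trip through numpy against the same operation kept on the device. The snippet below is a minimal sketch under stated assumptions: `time_fn`, `clip_via_numpy`, and `clip_on_device` are hypothetical helpers written for this example, and a simple clip stands in for the real proposal code. CUDA kernels launch asynchronously, so the timer synchronizes before reading the clock.

```python
import time

import numpy as np
import torch


def time_fn(fn, x, n_iters=20):
    """Return the mean wall-clock seconds per call of fn(x)."""
    use_cuda = x.is_cuda
    if use_cuda:
        torch.cuda.synchronize()  # finish pending GPU work before starting
    start = time.perf_counter()
    for _ in range(n_iters):
        fn(x)
    if use_cuda:
        torch.cuda.synchronize()  # wait for the timed kernels to complete
    return (time.perf_counter() - start) / n_iters


def clip_via_numpy(x):
    # Device -> host copy, numpy op, then host -> device copy.
    arr = x.cpu().numpy()
    return torch.from_numpy(np.clip(arr, 0.0, 1.0)).to(x.device)


def clip_on_device(x):
    # Same result, computed without leaving the device.
    return x.clamp(0.0, 1.0)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(10000, 4, device=device)

t_numpy = time_fn(clip_via_numpy, x)
t_torch = time_fn(clip_on_device, x)
print(f"numpy round trip: {t_numpy * 1e3:.3f} ms, on device: {t_torch * 1e3:.3f} ms")
```

On a machine without a GPU both paths run on the CPU, so the difference only becomes meaningful when `device` is `cuda`.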