Open amrahsmaytas opened 3 years ago
@wat3rBro It is taking 4 seconds to predict a frame on a Raspberry Pi 4.
I am using a Raspberry Pi 3B+.
With faster_rcnn_fbnetv3a_dsmask_C4.yaml
inference on one frame took ~10s.
With downscaling from 320px to 160px I got down to ~5s.
With downscaling and int8 quantization I got down to ~0.5s.
For downscaling I used `d2go.utils.demo_predictor.DemoPredictor` with `min_size_test=112` and `max_size_test=160`.
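For reference, the shortest-edge resize that this kind of predictor applies (detectron2-style: scale the shorter side to `min_size_test`, then cap the longer side at `max_size_test`) can be sketched in plain Python; the 240x320 camera frame below is just an assumed example:

```python
def resize_shortest_edge(h, w, min_size, max_size):
    """Scale so the shorter side equals min_size, then cap the
    longer side at max_size (detectron2-style shortest-edge resize)."""
    scale = min_size / min(h, w)
    if max(h, w) * scale > max_size:
        scale = max_size / max(h, w)
    return round(h * scale), round(w * scale)

# A 240x320 frame with min_size_test=112, max_size_test=160:
print(resize_shortest_edge(240, 320, 112, 160))  # -> (112, 149)
# A wider 240x640 frame hits the max_size cap instead:
print(resize_shortest_edge(240, 640, 112, 160))  # -> (60, 160)
```

This shows why the 320px frames end up at roughly half resolution with these two settings, which is where the ~2x speedup from downscaling comes from.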
For quantization I followed https://github.com/facebookresearch/d2go/blob/main/demo/d2go_beginner.ipynb
The default quantization engine did not work, so I used

```python
# https://github.com/pytorch/android-demo-app/issues/104
config = d2go.model_zoo.model_zoo.get_config('faster_rcnn_fbnetv3a_dsmask_C4.yaml')
config.QUANTIZATION.BACKEND = 'qnnpack'
```

for saving and

```python
# https://github.com/pytorch/pytorch/issues/29327#issue-518778762
torch.backends.quantized.engine = 'qnnpack'
```

for loading.
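The speedup from int8 quantization comes from replacing float32 arithmetic with 8-bit integer arithmetic. As a rough illustration of the idea (not the QNNPACK implementation itself), affine quantization maps each float to an int8 value via a scale and zero point:

```python
def quantize(x, scale, zero_point):
    """Affine-quantize a float to int8, clamping to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Map an int8 value back to (an approximation of) the float."""
    return (q - zero_point) * scale

# Example: scale and zero point chosen for values roughly in [-1, 1]
scale, zp = 1 / 127, 0
q = quantize(0.5, scale, zp)
print(q, dequantize(q, scale, zp))  # small rounding error vs. 0.5
```

The small rounding error per value is the accuracy cost of quantization; the per-layer speed win on ARM comes from QNNPACK's integer kernels.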
Hey @aleannox , thanks for sharing :)
Can you also share some more information, such as:
Did you see any accuracy drop after downscaling + int8 quantization?
If yes, how much?
Also, how is your Pi 3B+ running, i.e. what do the RAM, memory, and temperature stats look like?
Also, curious to know if you also tried DL compilers such as Apache TVM to optimise d2go for the Pi 3B+ hardware.
Thanks, Satyam.
Hey @amrahsmaytas,
Sure, happy to share :) Actually the 1s I mentioned was too pessimistic; I actually see ~0.5s inference time with downscaling and quantization.
I did not perform a quantitative analysis of accuracy. I am using the model to detect persons, and for this purpose I did not notice a performance drop.
The usage stats of my Pi with inference running are:
- 200% CPU for the process (of 4 cores)
- 30% RAM for the process (of 873MB)
- 83°C temp
I did not try DL compilers; for my purpose the 0.5s are sufficient.
Hope this helps :)
Yup, thanks for sharing 😃✌🏻
Can you also let me know whether the Raspberry Pi OS you used on your Pi 3B+ is 64-bit or 32-bit?
Do you have information about segmentation too?
And can you also please share the code to my mail greetsatyamsharma@gmail.com, I would be really thankful 😌 😃
Thanks in advance, Satyam
Can you share more details? 64-bit or 32-bit OS? If possible, can you share the code too? shivarajmahesh11@gmail.com
Hi guys
I am using a 32bit OS. I have not tried segmentation because I don't need it for my project. And you can find my code here: https://github.com/aleannox/leo/blob/main/vision.py
Cheers
Thanks @aleannox 😃
I don't see faster_rcnn_fbnetv3a_dsmask_C4.yaml in the model zoo, and I've had some trouble training it well. How should it compare to faster_rcnn_fbnetv3g_fpn.yaml?
🚀 Feature
Optimising d2go models for Raspberry Pi, along with multi-threading in order to use all four available cores on the Raspberry Pi.
Motivation & Examples
I tried running the QAT-optimised version of d2go on a Raspberry Pi 4, which has an ARM NEON architecture. But from what I read about the QAT backend in the official repo, it is only supported for mobile ARM architectures, so running the QAT-optimised model with the 'qnnpack' backend on the Raspberry Pi gives me an inference time of roughly 3-4 seconds per frame. I want to achieve the speed mentioned in the d2go repo, which is 0.05 seconds (50 milliseconds) for its pre-trained models (I will look into custom models once that is done).
Can anyone guide me on further optimising the QAT model for the ARM NEON (Raspberry Pi) architecture, or on any other good way to optimise the above d2go model to achieve the ~50 millisecond speed mentioned in the d2go repo?
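When comparing optimisations like these, a simple timing harness around the predictor call makes the per-frame latency reproducible; `predict` below is a hypothetical stand-in for your model's inference function, not a d2go API:

```python
import time

def benchmark(predict, frame, warmup=2, runs=10):
    """Time repeated inference calls, excluding warm-up runs,
    and return the mean latency in milliseconds."""
    for _ in range(warmup):
        predict(frame)
    start = time.perf_counter()
    for _ in range(runs):
        predict(frame)
    return (time.perf_counter() - start) / runs * 1000.0

# Example with a dummy "model" that sleeps for 5 ms per call:
mean_ms = benchmark(lambda f: time.sleep(0.005), None)
print(f"mean latency: {mean_ms:.1f} ms")
```

Warm-up runs matter here because the first calls after loading a quantized model are often much slower than steady state, which can make a single timed frame misleading.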
Describe what the feature would look like, if it is implemented.
If the feature gets implemented, it would be less resource-consuming and also lightning-fast in terms of predicting output.
Looking forward to the community to help me out
Thanks
P.S.: I was also thinking of experimenting with the conversion path (detectron2) PyTorch model --> ONNX --> TensorFlow --> TFLite, but I am not sure whether that would work, and if it did, whether I would get the speed mentioned in the d2go official repo (50 milliseconds). I would like to have suggestions on this part too.
@zhanghang1989 @petoor