dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

Best way to use YOLACT with a camera (how to use YOLACT in production)? #315

Open pieterblok opened 4 years ago

pieterblok commented 4 years ago

@dbolya Daniel, thank you very much for this nice new CNN! Thanks to you and your co-developers, we can deploy instance segmentation much more easily on small robots (with lighter hardware, like Jetsons). I really appreciate your work, because it can have a great impact on our robotics application!

I'm used to working with Matterport's version of Mask R-CNN. We use their code on a robot, but we want to move to YOLACT for the above-mentioned reasons. Unfortunately, I'm not that familiar with PyTorch, and I'm struggling with the following:

How can I best use YOLACT in combination with a streaming camera?

With Matterport's demo code (https://github.com/matterport/Mask_RCNN/blob/master/samples/demo.ipynb) it's a pretty easy procedure: first load the network (steps 1-4 in their Jupyter notebook), then run the code of step 5 in a loop to acquire images and immediately process them with a single line of code: results = model.detect([image], verbose=1)

I understand I have to use YOLACT's eval.py, but how can I best implement YOLACT in a similar manner to what we currently do: first load the network, parameters, and classes, and then call a "detect" line of code in a loop to process the images from the camera? Please excuse me if this question is super simple (I don't know where to start).

sdimantsd commented 4 years ago

Hi @pieterbl86, I'm not one of the developers (unfortunately), but look into the function prep_display in eval.py, especially these lines:

    with timer.env('Copy'):
        if cfg.eval_mask_branch:
            # Masks are drawn on the GPU, so don't copy
            masks = t[3][:args.top_k]
        classes, scores, boxes = [x[:args.top_k].cpu().numpy() for x in t[:3]]
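
For context, t there is the tuple returned by YOLACT's postprocess; in prep_display it is produced a few lines earlier, roughly like this:

    with timer.env('Postprocess'):
        # t = (classes, scores, boxes, masks), already sorted by score,
        # which is why the [:args.top_k] slices above work
        t = postprocess(dets_out, w, h, visualize_lincomb=args.display_lincomb,
                        crop_masks=args.crop, score_threshold=args.score_threshold)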

Which Jetson would you like to use?

pieterblok commented 4 years ago

> But look into the function prep_display in eval.py, especially these lines [...] Which Jetson would you like to use?

@sdimantsd Thank you very much. I just created another function using the main code from prep_display. The code works, thanks again.

I'm using an NVIDIA Jetson Xavier development board (512 CUDA cores). With YOLACT++ I obtain an image inference speed of 5 FPS.
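
For anyone who lands here later, a minimal load-once / detect-in-a-loop sketch pieced together from evalimage and prep_display in eval.py could look like this (the config name, weights path, camera index, and score threshold are assumptions; adapt them to your model):

    import cv2
    import torch

    from data import set_cfg
    from yolact import Yolact
    from utils.augmentations import FastBaseTransform
    from layers.output_utils import postprocess

    set_cfg('yolact_base_config')  # must match the weights you load

    with torch.no_grad():
        # Load the network once, up front
        net = Yolact()
        net.load_weights('weights/yolact_base_54_800000.pth')
        net.eval()
        net = net.cuda()
        net.detect.use_fast_nms = True  # eval.py sets this too

        cap = cv2.VideoCapture(0)  # streaming camera
        while True:
            ret, img = cap.read()
            if not ret:
                break
            h, w = img.shape[:2]
            frame = torch.from_numpy(img).cuda().float()
            batch = FastBaseTransform()(frame.unsqueeze(0))
            preds = net(batch)
            # Same t as in prep_display: (classes, scores, boxes, masks)
            classes, scores, boxes, masks = postprocess(
                preds, w, h, crop_masks=True, score_threshold=0.15)

The results come back as GPU tensors; call .cpu().numpy() on whichever ones you need on the CPU, just like the prep_display snippet above does.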

dbolya commented 4 years ago

@pieterbl86 This might be less of a drop-in replacement than @sdimantsd's suggestion, but if you want to try reconciling with it, you could also check out the video evaluation pipeline (run eval.py with --video=<camera_index>; the code for it is in the evalvideo function). It tries to keep the FPS stable and does some threading to squeeze as much performance out of the GPU as possible, by staggering data loading, data preparation, model evaluation, and drawing the frame (i.e., an output lag of 4 frames, but 4 stages are always running at once). Not sure how much that would benefit your use case.
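
For reference, an invocation along these lines runs that pipeline against camera 0 (the weights path and thresholds below are placeholders):

    python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=0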

Out of curiosity, what FPS did you get with Mask R-CNN?

thimabru1010 commented 4 years ago

> @sdimantsd Thank you very much. I just created another function using the main code from prep_display. The code works, thanks again. I'm using an NVIDIA Jetson Xavier development board (512 CUDA cores). With YOLACT++ I obtain an image inference speed of 5 FPS.

How did you install PyTorch on the Jetson? When I try to install torchvision it fails; torch itself works well.

sdimantsd commented 4 years ago

@thimabru1010 Follow the instructions in this link: https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano-version-1-4-0-now-available/
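
For the torchvision failure specifically, the gist of that thread is to build it from source on the Jetson, something along these lines (the v0.5.0 branch below is a placeholder; the torchvision version has to match your PyTorch version):

    sudo apt-get install libjpeg-dev zlib1g-dev
    git clone --branch v0.5.0 https://github.com/pytorch/vision torchvision
    cd torchvision
    sudo python setup.py install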

lzyang2000 commented 4 years ago

@pieterbl86 I am also looking into using this on the Xavier, and I wonder whether any of the NVIDIA optimizations like TensorRT or JetPack are utilized. Just curious if the NVIDIA stack could make it faster. For reference, here are some links: PyTorch with JetPack container, TensorRT

Thanks!

xsidneib commented 3 years ago

Hi @lzyang2000, try exporting your model to ONNX and creating an engine using TensorRT (on the Xavier) with DLA enabled.
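
A rough sketch of that export path, assuming you can get the graph to trace (YOLACT's postprocessing is not ONNX-friendly out of the box, so the export usually needs some model modifications first; treat everything below as a hypothetical starting point, not a verified recipe):

    import torch
    from data import set_cfg
    from yolact import Yolact

    set_cfg('yolact_base_config')  # match the weights
    net = Yolact()
    net.load_weights('weights/yolact_base_54_800000.pth')
    net.eval()

    dummy = torch.randn(1, 3, 550, 550)  # YOLACT's default 550x550 input
    torch.onnx.export(net, dummy, 'yolact.onnx', opset_version=11)

On the Xavier you could then build the engine with something like trtexec --onnx=yolact.onnx --useDLACore=0 --allowGPUFallback --fp16.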