sdimantsd opened this issue 4 years ago
Hmm, never seen that before. You've compiled and set up DCN, right?
@chongzhou96 have you seen this before?
Yes, but when I compile using `python3 setup.py build develop` I get a permission error, so I compiled using `sudo python setup.py build develop`.
But I don't think that's a problem, no?
Hmm, what happens if you run `python` and then in the shell try `import dcn_v2`?
The import works
Hmm what's your pytorch version?
1.3.0, running on a Jetson Nano
CUDA 10.0.326
Ok it's not version specific then. I'll get back to you on this.
OK, Thank you very much!
@dbolya Do you have any news?
+1
> 1.3.0 Running on Jetson nano

I want to know how many FPS YOLACT can achieve on a Jetson Nano and how well it recognizes small targets. In the end, I want to get the coordinates of the target rectangle or mask. Thanks in advance.
Both questions depend on the input size. What input size are you using?
With a ResNet-101 backbone, 700x700 takes about 1.6 seconds per frame (~0.625 FPS).
I had mistakenly assumed that YOLACT cannot be used on a Jetson Nano or TX2 because of their limited computing power, so I haven't tried it yet. What image size do you recommend, and what speed and accuracy can it achieve?
The image size depends on your needs: if your objects are small, you will need a bigger input size; if not, you can use a smaller one. According to the YOLACT paper, the mAP difference between input sizes 550 and 700 is only 1.4% (29.8% vs. 31.2%), but the difference in FPS is bigger (33.5 vs. 23.6).
I want to get a background mask, such as a highway or lawn, to determine boundary coordinates for autonomous driving. How many coordinates will the target mask output? I can't determine the boundary from four coordinates like a rectangular box does, because the background's boundary is irregular. I also have a question: how well does YOLACT recognize small targets (about tens to 200 pixels)? Thanks for your response.
@chongzhou96 @dbolya Hi, anything new?
@sdimantsd Oops I asked @chongzhou96 to check in on this but I don't think he got anywhere. I'm going to tentatively say that the jetson you have there doesn't have enough CUDA kernels to run the custom deformable conv kernel code. Fixing this might require changing the deformable conv code to batch its calls better, which is not something I can really do easily.
Perhaps it would be useful to train a version of YOLACT++ without deformable convs? The other improvements would still give a benefit over the base version, and that should work on your GPU.
@dbolya is there an option to train the Yolact++ without deformable convs?
@abhigoku10 Simply replace the backbone (`resnet101_dcn_inter3_backbone`) here:
https://github.com/dbolya/yolact/blob/13bb0c6322aa35777b73d3ca6522a080588fef03/data/config.py#L775
with `yolact_base_config.backbone`, and `resnet50_dcnv2_backbone` here:
https://github.com/dbolya/yolact/blob/13bb0c6322aa35777b73d3ca6522a080588fef03/data/config.py#L797
with `yolact_resnet50_config.backbone`.
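A minimal sketch of that override, assuming the `Config.copy(overrides)` pattern that yolact's `data/config.py` uses (the `Config` class and the backbone values below are simplified stand-ins, not the real ones):

```python
# Simplified stand-in for the Config class in yolact's data/config.py;
# the real class supports the same .copy(overrides) pattern.
class Config:
    def __init__(self, d):
        self.__dict__.update(d)

    def copy(self, overrides=None):
        new = Config(dict(self.__dict__))
        if overrides:
            new.__dict__.update(overrides)
        return new

# Illustrative placeholders (strings instead of real backbone config objects).
yolact_base_config = Config({'name': 'yolact_base',
                             'backbone': 'resnet101_backbone'})

# YOLACT++ config, but reusing the non-DCN backbone from the base config
# instead of resnet101_dcn_inter3_backbone:
yolact_plus_base_config = yolact_base_config.copy({
    'name': 'yolact_plus_base',
    'backbone': yolact_base_config.backbone,  # no deformable convs
})

print(yolact_plus_base_config.backbone)  # resnet101_backbone
```

The point is only that the `backbone` field of the YOLACT++ configs gets swapped for the plain one; everything else in the config is inherited unchanged.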
> is there an option to train the Yolact++ without deformable convs?
yes: https://github.com/dbolya/yolact/issues/251#issuecomment-577988085
Interesting. What are the advantages of YOLACT++ (without DCNv2) compared to YOLACT? Maybe the default code should switch to running YOLACT++ without DCN and only make use of it if compiled in?
@breznak The problem is it would require retraining, but the fact that the recent versions of pytorch have deformable convs is very good news. We could potentially avoid this issue entirely. I guess we just need to wait until they add DCNv2 support.
Also, the model performs 1.6-2.8 mAP worse without the deformable convs, so it's significant enough that I'd say we want to keep them in.
> Perhaps it would be useful to train a version of YOLACT++ without deformable convs? The other improvements would still give a benefit over the base version, and that should work on your GPU.
I was curious about the "other improvements over the previous version [of YOLACT]" ?
> The problem is it would require retraining, [...] it's significant enough that I'd say we want to keep them in
Yes, but there are already duplicate weights/configs for YOLACT/YOLACT++. My idea was that there could be only YOLACT++ (and, until DCNv2 is generally available, versions with DCNv1/DCNv2).
> I was curious about the "other improvements over the previous version [of YOLACT]"?
Other improvements and their impact on performance/speed are described in the YOLACT++ paper.
> yes, but there are already duplicate weights/configs for yolact/yolact++. My idea was if there should be only YOLACT++ (and until generaly available) versions with DCNv1/DCNv2
The original YOLACT models are important to verify the claims made in the original paper and to compare the models against future papers.
Honestly, it's probably not that important to have YOLACT++ models without DCN, since the performance in that case is close enough to the original YOLACT models anyway (also, most people here are retraining from scratch instead of just using the COCO-trained model). I'd also rather fix errors in the current DCNv2 compilation pipeline than force anyone who has errors to use a worse version of the model.
I think we can just wait until Pytorch finally implements DCNv2 and then go from there.
@breznak @dbolya From the accuracy point of view, for person detection YOLACT++ has a good upper hand compared to YOLACT; I had a hard time improving person detection with YOLACT. On the improvements end, I had a few questions @dbolya
@abhigoku10
@dbolya Thanks for the response. Q1: yup, something like a rotated bounding box. Q2: okay, I shall try this. Q3: yes, I want to detect objects in the background which I have not trained on yet; since the network is treating them as background, can it give an outline of the background structure?
@abhigoku10 Q3: Anything not labeled as an object is background. I don't think that helps you much, because everything that is not labeled is background, including the sky/trees/houses, etc.
@sdimantsd Yup, I need to see if I can get at least the contours of the trees, buildings, and lamppost structures.
@abhigoku10 For Q3, I recommend you use some other method for the contours and such, maybe just mask the contours with the actually detected masks to get some sort of "background contour". Idk if that will be useful for you, but like @sdimantsd said, the network detects all background equally, so you can't really mine objects from that. You could see if there are any detections in the background that have slightly higher (but still low) scores for the rest of the classes, but I don't think that would be worth the effort.
For Q1 yeah that's a whole different research project in its own right. I believe there was an issue a while ago referencing some rotated bbox paper, but it would take a non insignificant amount of research to merge the two models.
@dbolya Oh okay. For Mask R-CNN there is a rotated Mask R-CNN paper: https://github.com/mrlooi/rotated_maskrcnn
Taken from https://github.com/xingyizhou/CenterNet/issues/461 (CenterNet also uses DCNv2):
If this is still open: you need to modify code in the CUDA file (should be `src/lib/models/networks/DCNv2/src/cuda/dcn_v2_im2col_cuda.cu`), changing the thread count to:
`const int CUDA_NUM_THREADS = 512;`
Then compile again. For some reason Aarch64 cannot handle 1024.
In yolact, the file is at `src/cuda/dcn_v2_im2col_cuda.cu`.
I've tried this on jetson nano and it worked
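For context, that constant sets the thread-block size for every kernel launch in the file; here is a sketch of the `GET_BLOCKS`-style launch sizing, written in Python purely for illustration:

```python
# Illustrative Python mirror of the GET_BLOCKS helper in dcn_v2_im2col_cuda.cu.
# CUDA_NUM_THREADS is the thread-block size; lowering it from 1024 to 512
# keeps each launch within the Jetson Nano GPU's per-block resource limits,
# avoiding the "too many resources requested for launch" error.
CUDA_NUM_THREADS = 512  # was 1024

def get_blocks(n: int) -> int:
    """Number of thread blocks needed to cover n work items (ceiling division)."""
    return (n + CUDA_NUM_THREADS - 1) // CUDA_NUM_THREADS

print(get_blocks(1000))    # 2
print(get_blocks(100000))  # 196
```

Halving the block size doubles the number of blocks launched, but the total thread count (and thus the work done) stays the same, so the change only affects how the work is scheduled, not the results.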
@VictimCrasher Thanks! Works well :)
If anyone is still having this problem, there is a solution that NVIDIA has posted here: https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-5-0-now-available/72048. I think version 1.5 of this PyTorch build is fixed (CUDA 10.2 required; JetPack 4.4 has it). Version 1.4 of PyTorch has to be compiled manually; there are instructions at the end of the link. Note that some changes to the source code need to be made (there is a Git reference there that describes the required changes). I haven't checked yet whether it works; I'm currently compiling.
It turns out there is another problem: the DCNv2 code needs to be changed as well. In the files `src/cuda/dcn_v2_im2col_cuda.cu` and `src/cuda/dcn_v2_psroi_pooling_cuda.cu`, change `const int CUDA_NUM_THREADS = 1024;` to `const int CUDA_NUM_THREADS = 512;` and compile again.
The first YOLACT works fine. In YOLACT++ I get this error:
python3 eval.py --trained_model=weights/yolact_plus_resnet50_54_800000.pth --score_threshold=0.3 --top_k=25 --images=/home/ws/imgs/300/:/home/ws/imgs_out
error in modulated_deformable_im2col_cuda: too many resources requested for launch
(the line above repeats many times)
/home/ws/imgs/300/6.jpg -> /home/ws/imgs_out/6.png
error in modulated_deformable_im2col_cuda: too many resources requested for launch
... and so on
Do you know what the problem is?