AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Non-local block #4038

Open as1392 opened 5 years ago

as1392 commented 5 years ago

https://arxiv.org/pdf/1711.07971.pdf It improves mAP by more than 1% for Mask R-CNN, even with the X-152 backbone. It seems worth implementing.

AlexeyAB commented 5 years ago

On the task of video classification, even without any bells and whistles, our non-local models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code is available at https://github.com/facebookresearch/video-nonlocal-net.

The authors of [7] have shown that I3D models are more accurate than their CNN+LSTM counterparts.

CuongNguyen218 commented 5 years ago

Hi @AlexeyAB, I want to use a non-local block. How can I do it? I think Darknet lacks matrix operators such as transpose, or something like the ones in PyTorch or NumPy.

AlexeyAB commented 5 years ago

@CuongNguyen218 Do you need just one transpose layer (exchanging width <-> height), or some other layers as well for the non-local block?

CuongNguyen218 commented 5 years ago

@AlexeyAB, yes. At that time, I just wanted a transpose layer and a dot-product operator.
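
To illustrate why a transpose and a dot-product (matrix multiply) operator are the two missing pieces, here is a minimal pure-Python sketch of the non-local operation. It is not Darknet code. It assumes the plain Gaussian variant from the paper with the learned embeddings (theta, phi, g, W_z) replaced by identity, and it treats the feature map as a C x N matrix with N = H*W flattened positions. The helper names `transpose`, `matmul`, `softmax_rows`, and `non_local` are made up for this example.

```python
import math


def transpose(m):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two 2-D lists (rows of a dotted with columns of b)."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def softmax_rows(m):
    """Numerically stable softmax applied to each row independently."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def non_local(x):
    """Non-local operation on a C x N feature matrix.

    Assumption: theta, phi, g and W_z are identity, so this is only the
    attention-style core, not a trainable block.
    """
    xt = transpose(x)              # N x C: one row per spatial position
    affinity = matmul(xt, x)       # N x N: dot product between every pair of positions
    attn = softmax_rows(affinity)  # normalize over all positions j for each i
    y = matmul(attn, xt)           # N x C: weighted sum of features
    yt = transpose(y)              # back to C x N
    # Residual connection: z = x + y
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, yt)]


if __name__ == "__main__":
    features = [[1.0, 1.0], [2.0, 2.0]]  # C=2 channels, N=2 positions
    print(non_local(features))
```

The N x N affinity matrix is the expensive part: it grows quadratically with the number of spatial positions, which is why the block is usually placed on low-resolution feature maps.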

as1392 commented 4 years ago

https://arxiv.org/pdf/1811.11721.pdf https://github.com/speedinghzl/CCNet Well, CCNet claims it is better than non-local blocks (mAP, FLOPs, memory), and it has a PyTorch implementation. CCNet could be more worth implementing.
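
As a rough back-of-the-envelope check on the FLOPs/memory claim, the sketch below counts pairwise affinities only. It assumes a non-local block compares every position with every other position, while one criss-cross pass compares each position only with the H + W - 1 positions in its own row and column. The function names are hypothetical, and the counts ignore the 1x1 convolutions and any repeated passes.

```python
def non_local_pairs(h, w):
    """Affinities computed by a full non-local block: every position vs. every position."""
    n = h * w
    return n * n


def criss_cross_pairs(h, w):
    """Affinities for one criss-cross pass: each position vs. its own row and column."""
    return h * w * (h + w - 1)


if __name__ == "__main__":
    h, w = 64, 64
    full = non_local_pairs(h, w)       # 16777216
    sparse = criss_cross_pairs(h, w)   # 520192
    print(full, sparse, round(full / sparse, 2))
```

For a 64 x 64 feature map this gives roughly a 32x reduction in the size of the attention map per pass, which is consistent in spirit with the memory savings the CCNet authors report.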