Open sdimantsd opened 4 years ago
If I'm not mistaken, there's a maximum number of detections (the number of anchors) that can be made at each FPN layer. So there's an intrinsic limit on the total number of detections (the sum of anchors over all FPN layers).
Hope that helps.
Thanks, I think it works now. I think there were 2 problems:
@sdimantsd @jasonkena can you please elaborate on the first point, i.e. the number of anchors? I am having this issue with person detection.
@sdimantsd just to understand, was there any specific reason to go with YOLACT for these drone images? There is another architecture, https://github.com/CosmiQ/yolt, which is used for drone/satellite imaging.
@abhigoku10 The truth is I just didn't know about this network. Thanks.
Now I need to work with YOLACT with rectangular input. Is it possible, @dbolya?
@sdimantsd Glad it helped. @abhigoku10, here's the network diagram from the YOLACT++ paper (Fig. 2) and the Prediction Head (Fig. 4).
So the way the network makes predictions is by evaluating the Prediction Head on each of the layers of the Feature Pyramid (P3, P4, P5, P6, P7), and each of these evaluations produces *a* anchors, so the theoretical limit on the number of proposed detections is 5*a*. If I'm not mistaken, YOLACT++ uses *a* = 9, for all possible combinations of 3 aspect ratios and 3 scales per FPN layer, effectively limiting the number of detections to 45.
@sdimantsd can you please post an example evaluation of the network on the image you mentioned (the shrunken one)?
@jasonkena I posted it, it's the second image
notice the black area at the bottom of the image
Sorry, I meant the one without the black pixels
It's the second image in my first message.
How about with an increased number of anchors?
The last image is with an increased number of anchors and shrinking; the first is without. I don't have an image with an increased number of anchors but without the black area.
@jasonkena Haha, your analysis is almost correct, but there are 3 anchor aspect ratios and 1 anchor scale per layer, totaling 3 anchors for each layer. And that's the number of anchors per feature location per layer, not just per layer. For instance, since P3 is 69x69, there are 69*69*3 anchors on that layer, for a total of 14283 anchors for just P3 alone. In total over all layers there are 19248 anchors, so that wouldn't be the issue here.
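To make that concrete, here's a quick back-of-the-envelope sketch of the anchor count for a 550x550 input, assuming the FPN feature map sizes (69, 35, 18, 9, 5) and the 3 aspect ratios per location described above:

```python
# Back-of-the-envelope anchor count for a 550x550 input.
# Feature map sizes assume the stock FPN strides; 3 anchor aspect ratios per location.
fpn_sizes = {"P3": 69, "P4": 35, "P5": 18, "P6": 9, "P7": 5}
anchors_per_location = 3  # 3 aspect ratios, 1 scale per layer

total = 0
for layer, size in fpn_sizes.items():
    count = size * size * anchors_per_location
    total += count
    print(f"{layer}: {size}x{size}x{anchors_per_location} = {count} anchors")

print("total:", total)  # 19248
```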
The actual possible bottlenecks are: NMS is set to output only 100 detections max (since COCO only allows 100 detections), the input to NMS uses only the top 200 detections, and when displaying you need to set --top_k to the number of detections you want.
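If you need more than that, those caps are (going from memory, so double-check the exact names in data/config.py for your copy of the repo) just config values plus the --top_k flag at display time. A rough sketch of raising them:

```python
# Hedged sketch: field names are from my reading of data/config.py and may
# differ slightly in your version of the repo.
from data.config import yolact_base_config

many_dets_config = yolact_base_config.copy({
    'name': 'yolact_many_dets',
    'nms_top_k': 400,            # how many boxes are fed into NMS (default mentioned above: 200)
    'max_num_detections': 300,   # how many detections NMS may output (default mentioned above: 100)
})
```

And at display time you'd still pass something like `--top_k=300` to eval.py so the visualization doesn't clip the extra detections.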
And yeah @sdimantsd the issue is the aspect ratio. I have non-box image support on my TODO list, but it will only work if all your images have the same aspect ratio (since training with varied aspect ratio images doesn't work, probably because protonet internally relies on some odd properties of conv layers to do its dirty work, so changing the aspect ratio screws with it).
@dbolya The COCO dataset has non-box images with different aspect ratios. Does this mean that if all pictures are cropped to the same size (and same aspect ratio), the performance (mAP) of YOLACT will be much better?
@enhany On the subset of cropped COCO, most likely yeah. Currently, the aspect ratio is stretched all over the place by resizing everything to 550x550, and like you said it's not consistent because of the varying aspect ratios in COCO. Maybe it'd be better to take the most common aspect ratio in COCO and pad every image to that aspect ratio (disregarding that you'll have fewer pixels to work with if the aspect ratio is different). I might try that out later.
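In case anyone wants to experiment with that before it's in the repo, here's a rough standalone sketch of the padding idea (pad with border pixels instead of stretching); this is not YOLACT code, just an illustration:

```python
import cv2
import numpy as np

def pad_to_aspect_ratio(img, target_ratio=4 / 3):
    """Pad an image with black pixels (no stretching) so that
    width / height == target_ratio. Rough sketch, not YOLACT code."""
    h, w = img.shape[:2]
    if w / h < target_ratio:
        # Too narrow: pad the width.
        new_w = int(round(h * target_ratio))
        pad = new_w - w
        return cv2.copyMakeBorder(img, 0, 0, pad // 2, pad - pad // 2,
                                  cv2.BORDER_CONSTANT, value=0)
    else:
        # Too wide (or already matching): pad the height.
        new_h = int(round(w / target_ratio))
        pad = new_h - h
        return cv2.copyMakeBorder(img, pad // 2, pad - pad // 2, 0, 0,
                                  cv2.BORDER_CONSTANT, value=0)

# Example: pad a 640x480 (4:3) image to a 2:1 aspect ratio.
img = np.zeros((480, 640, 3), dtype=np.uint8)
padded = pad_to_aspect_ratio(img, target_ratio=2.0)
print(padded.shape)  # (480, 960, 3)
```

Note that if you pad during training, the ground-truth boxes and masks would have to be shifted by the same offsets.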
@dbolya, Thanks. Are you planning to implement non-box image support soon?
@sdimantsd It's at the top of my TODO list, but I have other projects with deadlines coming soon so it might be a while.
Hi @dbolya, I saw the preserve_aspect_ratio option in config.py. It looks like that option preserves the width/height ratio, right? Does it work? If so, this would be a very good solution to my original post, since a lot of my image area is not necessary...
@sdimantsd preserve_aspect_ratio doesn't work very well, but I've implemented that fixed non-square aspect ratio thing I was talking about. I have to do a little bit more testing, then I'll push it on Monday. Tentatively it increases mAP on Cityscapes (2:1 aspect ratio) by ~1 mAP.
Hi @dbolya, anything new with this feature?
Thanks!!!
Hi @dbolya Thanks so much for all your work! Do you have any expectations of when you will be done with this feature?
Hi @dbolya, Do you know when you will be able to push this update?
Hi @dbolya Any new updates on this topic?
@dbolya I'd also be very interested in any updates with regards to being able to train on non-square images, since 16:9 images are incredibly common these days. :smiley:
Hi, I'm trying to overfit YOLACT, but there are objects that are not consistently identified. Attached are two images: the first is my segmentations, and the second is the network result. Can someone help me please?
I'm using ResNet-101 as my backbone and 700x700 as the input.
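For reference, my setup is roughly the following (a sketch; yolact_base_config and the max_size field are from data/config.py in this repo, and its default backbone is already ResNet-101; adjust if your copy differs):

```python
from data.config import yolact_base_config

# Sketch of my setup: default ResNet-101 backbone, 700x700 input.
my_overfit_config = yolact_base_config.copy({
    'name': 'my_overfit_700',
    'max_size': 700,  # input images are resized to 700x700
})
```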
Thanks!
My segmentations:
Network result: