xuebinqin / BASNet

Code for the CVPR 2019 paper "BASNet: Boundary-Aware Salient Object Detection"
MIT License
1.35k stars 249 forks

How to do object detection with BASNET? #21

Closed anguoyang closed 4 years ago

anguoyang commented 4 years ago

The segmentation result looks pretty good, but since the result is only a map image, how can I detect the objects in it? For example, detecting cats and dogs with bounding boxes. I applied the findContours function to the resulting map, but there are too many contours on a single "area".

anguoyang commented 4 years ago

Thank you @NathanUA

xuebinqin commented 4 years ago

Salient Object Detection (SOD) focuses on segmenting the most visually attractive regions or objects in a given image. It is a binary (two-class) segmentation problem; you can think of it as foreground/background segmentation. It pays no attention to the semantic meaning of the segmented masks.
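
For illustration, a minimal sketch of this foreground/background view (assuming the BASNet output has been saved as an 8-bit grayscale image at a hypothetical path): the map can be binarized before any box extraction.

```python
import cv2

# hypothetical path: the BASNet saliency map saved as 8-bit grayscale
sal = cv2.imread("saliency_map.png", cv2.IMREAD_GRAYSCALE)

# threshold the probability-like map into a binary foreground mask
_, fg_mask = cv2.threshold(sal, 127, 255, cv2.THRESH_BINARY)
```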

anguoyang commented 4 years ago

Thank you for the explanation. Is there a similar approach that adds semantic meaning to SOD? Or is it possible to extract the accurate contour from the map with OpenCV?

anguoyang commented 4 years ago

What I want to do is detect objects with few training images (e.g. 1-5 images per class). My idea is to do SOD first to crop the object areas and then use a voting-like mechanism to judge the class of each cropped area, or just resize the cropped images to the same size and do image classification by feature comparison. The problem is that we cannot get an accurate rectangular box on each object.

The result of BASNet looks pretty good (by eye), but when we use OpenCV to find the contours we cannot get the proper rectangle; instead we get lots of contours on each object. Sorry for the confusion.

xuebinqin commented 4 years ago

You can use our BASNet as preprocessing to produce the exact region masks, and add another segmentation network after BASNet to learn the semantic meaning.

xuebinqin commented 4 years ago

OpenCV is able to extract the contour of each region, but I don't think it can find the exact contour of each object. You could try instance segmentation.

On Tue, Oct 15, 2019 at 7:47 PM goodman wrote:

It seems that it is difficult to get the "exact" region masks with OpenCV (the findContours function); could you please share some ideas? Thank you.

xuebinqin commented 4 years ago

Or try the idea of Mask R-CNN.

anguoyang commented 4 years ago

Yes, exactly. When I use OpenCV I get about 15 contours on a single object, so it is difficult to get the exact rectangle around the object.

Do you mean applying Mask R-CNN to your result map image?

xuebinqin commented 4 years ago

What do you mean by about 15 contours on a single object?

anguoyang commented 4 years ago

Yes, sometimes more. To my eye the object area is pure white, which is very accurate, but when I draw the contours as rectangles with OpenCV, I get about 15 rectangles on the object.

anguoyang commented 4 years ago

```python
cnts, hierarchy = cv2.findContours(output, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in cnts:
    (x, y, w, h) = cv2.boundingRect(cnt)
    cv2.rectangle(outframe, (x, y), (x + w, y + h), (0, 255, 0), 2)
```

I draw the contours like this; `output` is the image resized from the result map, the same as in your code.

anguoyang commented 4 years ago

I uploaded the result image here: https://ibb.co/vdWJK7F

xuebinqin commented 4 years ago

Why not extract the maximally connected component directly?
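
A minimal sketch of that suggestion (assuming `fg_mask` is a binarized 8-bit saliency mask, as above):

```python
import cv2
import numpy as np

# label the connected foreground components of the binary mask
num, labels, stats, _ = cv2.connectedComponentsWithStats(fg_mask, connectivity=8)

# row 0 of stats is the background; keep the largest remaining component
largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
x, y, w, h, area = stats[largest]
object_mask = (labels == largest).astype(np.uint8) * 255
```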

xuebinqin commented 4 years ago

What does the output saliency map look like?

anguoyang commented 4 years ago

If there is only one object, yes, we can do it with the maximally connected component, but sometimes there are 2 or 3 objects near each other and we cannot differentiate them.

anguoyang commented 4 years ago

The saliency map is this one: https://ibb.co/vdWJK7F

I just drew the contour rectangles on it.

xuebinqin commented 4 years ago

I know that watershed is OK for separating simple objects, but separating complex shapes into different objects would be a bit hard for an unsupervised method. So I suggest trying instance segmentation: https://arxiv.org/pdf/1703.10277.pdf, http://www.ipb.uni-bonn.de/wp-content/papercite-data/pdf/milioto2019icra-fiass.pdf

anguoyang commented 4 years ago

I uploaded the saliency map without contours here: https://ibb.co/dkF8QJL

xuebinqin commented 4 years ago

It looks like just a circular blob?

anguoyang commented 4 years ago

It is actually an orange (moving with the belt); that is just a sample, and we also need to detect other shapes.

xuebinqin commented 4 years ago

OK, I see. If the object shapes are relatively simple (without many concave structures), you can try the watershed segmentation method on the results of our BASNet. Please refer to the watershed tutorials: https://scikit-image.org/docs/dev/auto_examples/segmentation/plot_watershed.html (skimage) and https://docs.opencv.org/3.4/d2/dbd/tutorial_distance_transform.html (OpenCV).
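
A rough sketch along the lines of the OpenCV tutorial above (not the tutorial code itself; `fg_mask` is again an assumed binarized BASNet output):

```python
import cv2
import numpy as np

# peaks of the distance transform act as seeds inside each touching blob
dist = cv2.distanceTransform(fg_mask, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)

# one marker per seed; label 0 is reserved for the uncertain band between objects
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[(fg_mask > 0) & (sure_fg == 0)] = 0

# watershed expects a 3-channel image; region boundaries come back as -1
color = cv2.cvtColor(fg_mask, cv2.COLOR_GRAY2BGR)
markers = cv2.watershed(color, markers)
```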

anguoyang commented 4 years ago

Thank you so much, I will check these papers and the code. The original image is: https://ibb.co/h8TsDNh

xuebinqin commented 4 years ago

You're very welcome.

anguoyang commented 4 years ago

I have re-trained the model with my specific data (about 4000 images and masks, without the DUTS-TR data), and the result is perfect after applying the watershed algorithm, even though I trained for only 40000 epochs (the loss and tar are all below 1.0). I uploaded the result image here: https://ibb.co/CwX5K82

anguoyang commented 4 years ago

By the way, are there any algorithms for pixel-wise classification? It seems that CNN feature extractors are always based on rectangles (resize to NxN and forward through the network), but I suppose pixel-wise classification on the polygon would be better.

xuebinqin commented 4 years ago

That looks pretty good. Congratulations.

xuebinqin commented 4 years ago

Sorry, I can't get your point.

anguoyang commented 4 years ago

After I get the accurate mask for each object with watershed, I will do image classification in the next phase. I will crop the ROI within each mask area on the original image. For example, in the result image, to find out which class the green area belongs to, I will crop it from the original image, compute features on this cropped image (without any background, just the foreground), and compare them with the features in my database (enrolled beforehand with the same feature extraction policy) to find out which class it belongs to.
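
One possible shape for that cropping step (`image` and `object_mask` are assumed names for the original frame and one object's binary mask):

```python
import cv2

# crop the object's bounding box from the original image
x, y, w, h = cv2.boundingRect(object_mask)
roi = image[y:y + h, x:x + w]
roi_mask = object_mask[y:y + h, x:x + w]

# zero out everything outside the mask so that only foreground
# pixels enter the feature extractor
foreground_only = cv2.bitwise_and(roi, roi, mask=roi_mask)
```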

anguoyang commented 4 years ago

This is similar to a face recognition system: we first extract face features into a face database (in the enrollment phase), and in the recognition phase we do face detection, extract the face feature, and compare it with the face database to find out who the person is. The difference is that in this application the object area is not a rectangle and has no background.

xuebinqin commented 4 years ago

OK, I see. That's actually a good idea, and many people have thought about it, but it is quite hard to implement (orientation and scale have to be considered). Almost all of the current existing (discriminative) methods use images cropped from the bounding boxes. If there is not too much background, a bounding box is fine. Otherwise you could try some generative methods, such as Gaussian mixture models.

anguoyang commented 4 years ago

OK, thank you, I will try these methods. The reason I would like to discard the background is that, with bounding boxes, the background will probably introduce other foreground objects, or at least parts of them; for example, a long French bread placed at a 45° angle from top-left to bottom-right, with another small bread placed side by side :)

anguoyang commented 4 years ago

@NathanUA Btw, do you plan to add a mobilenet version? Could you please give advice on how to make it faster by modifying the backbone? Thanks.

xuebinqin commented 4 years ago

Sorry, I don't have a plan for a mobilenet version within the next several weeks. We have just developed another, smaller model that achieves competitive results against the SOTA models, and we may release it soon.

Ways of making it faster: (1) use a smaller input size; (2) drop the pre-trained resnet-34 and implement resnet-34 yourself with smaller filter numbers; (3) try to implement a mobilenet or shufflenet version ...

anguoyang commented 4 years ago

sounds great!

anguoyang commented 4 years ago

Is there any advice on data augmentation? I found that your training data used just v-flip and the pretrained model got quite good results. However, when I did the same v-flip augmentation on the training data (your default DUTS-TR plus my specific training data) and ran inference on val images (similar to my specific training data, with the same objects), the result was even worse than your pretrained model, which is really strange. Looking at the training log, the tar loss was about 0.8 and the iteration count was about 180k (I don't remember exactly, but it was very large; training took about a week on a 1080 Ti).

anguoyang commented 4 years ago

@NathanUA my specific training data was generated by running inference with your model; that is, I ran your default model with basnet_test.py on my original images. Because there is only one object on a clean background, the result images are quite good, and I used these result images (with v-flip) together with DUTS-TR to train my model.

xuebinqin commented 4 years ago

Sorry, I may have misunderstood your idea. One question: if the results inferred by the pretrained model are good, why do you retrain the model? If some bad inference results are taken as part of the training images, they will distract the network and undoubtedly give worse results. My suggestions: (1) if you don't want to label data manually, try combining DUTS-TR, DUTS-TE, DUT-OMRON, HKU-IS, PASCAL-S and SOD into a bigger dataset to further improve BASNet; (2) label your dataset manually and retrain the network w/ or w/o the pretrained model; (3) label your dataset manually and combine it with DUTS-TR (w/o knowing what your data looks like, this suggestion is not recommended).

Best of luck.

Regards, Xuebin

anguoyang commented 4 years ago

Thank you so much for your response. 1. "If the results inferred by the pretrained model are good, why do you retrain the model?" The results on ideal/simple images (a clean background, e.g. a table in a pure color, with only 1-2 objects on it) are quite good, but on complicated images (several objects placed closely side by side) the model sometimes generates false results. So I need to combine your training data and my specific training data (the simple images above) for training; but in the eval/testing phase the images are always hard (the background is not so complicated, but there are always several objects in it, always placed side by side). We did this because we don't want to label those hard images, which would take quite a lot of time.

anguoyang commented 4 years ago

btw, what does w/ or w/o mean?

xuebinqin commented 4 years ago

w/ denotes "with"; w/o means "without".

xuebinqin commented 4 years ago

Sorry for the late reply. I am not asking for your answers to this question; I just want to point out why your pipeline doesn't work. There is no free lunch: without a correctly labeled dataset, it is usually hard to get good results. Maybe you could try a GAN to generate a virtual training dataset.

anguoyang commented 4 years ago

Thank you. Btw, PASCAL-S is different from the others because the object in the label image is not white but grey. Is it good to train it together with the other datasets?

xuebinqin commented 4 years ago

No, you can't use PASCAL-S directly, because there are multiple classes. You can binarize each ground truth with its mean.
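
A minimal sketch of that binarization (the file name is hypothetical):

```python
import cv2

# one PASCAL-S ground-truth map, loaded as grayscale
gt = cv2.imread("pascal_s_gt.png", cv2.IMREAD_GRAYSCALE)

# threshold each ground truth at its own mean to get a binary mask
_, binary = cv2.threshold(gt, float(gt.mean()), 255, cv2.THRESH_BINARY)
cv2.imwrite("pascal_s_gt_binary.png", binary)
```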

anguoyang commented 4 years ago

Thanks, I have collected about 160k image/label pairs :) I found that HKU-IS and MSRA10K/B are really good data. Training is running now; I plan to train a base model first and then finetune on my specific data (especially the bad-result images, labeled manually), and I hope to get a better result.

anguoyang commented 4 years ago

Hi @NathanUA, I have trained with a huge dataset (160k images, including augmentation) for about 40 epochs and got a better result than before. However, when I load the epoch-40 model as a pretrained model and continue training with my specific data (manually annotated, about 230 images including augmentation), the train loss and tar drop greatly; yet when I tested it, the result was even worse than the epoch-40 model, which is beyond my expectation. Could you please give some advice? Thank you.