kmaninis / OSVOS-PyTorch

PyTorch implementation of One-Shot Video Object Segmentation (OSVOS)
http://vision.ee.ethz.ch/~cvlsegmentation/osvos
GNU General Public License v3.0

official measure code #26

Closed InstantWindy closed 5 years ago

InstantWindy commented 5 years ago

I have a question about the official code "jaccard.py" file and the "f_boundary.py" file.

The "db_eval_iou" function in the "jaccard.py" file takes two parameters: the binary annotation map and the binary segmentation map. Does the binary segmentation map refer to the raw network output binarized with a threshold of 0.5, or to the network output binarized with a threshold of 0.5 after the sigmoid activation function?

The "db_eval_boundary" function in the "f_boundary.py" file also takes two parameters: foreground_mask (binary segmentation image) and gt_mask (binary annotated image). Does gt_mask refer to the binarized annotation? And what exactly is the foreground_mask parameter?

I am new to video segmentation. Thank you!

kmaninis commented 5 years ago

Yes, both are binary. The optimal threshold is around 0.8.
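
For readers following along, here is a minimal sketch of what "binary" means in practice, assuming the network outputs raw logits and that the threshold is applied after the sigmoid. The helper names `binarize` and `jaccard` are hypothetical and are not the official `db_eval_iou`; returning 1.0 when both masks are empty is a convention chosen for this sketch.

```python
import numpy as np

def sigmoid(x):
    """Map raw logits to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def binarize(logits, threshold=0.8):
    """Hypothetical helper: sigmoid first, then threshold to a boolean mask."""
    return sigmoid(logits) > threshold

def jaccard(annotation, segmentation):
    """Intersection over union of two binary masks (illustrative only)."""
    annotation = np.asarray(annotation).astype(bool)
    segmentation = np.asarray(segmentation).astype(bool)
    union = np.logical_or(annotation, segmentation).sum()
    if union == 0:
        # Both masks empty: treated as a perfect match in this sketch.
        return 1.0
    return np.logical_and(annotation, segmentation).sum() / union

# Toy 2x2 example: raw network outputs and a binary ground truth.
logits = np.array([[4.0, 1.0],
                   [2.0, -3.0]])
gt = np.array([[1, 1],
               [0, 0]])

pred = binarize(logits, threshold=0.8)
iou = jaccard(gt, pred)
print(pred)
print(iou)
```

Note that thresholding the raw logits at 0.8 and thresholding the sigmoid output at 0.8 give different masks, so it is worth checking which of the two a given pipeline does.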

InstantWindy commented 5 years ago

But when I set the threshold to 0.8, the IoU is 0, and I don't know why. My code is as follows: [image]

scaelles commented 5 years ago

Hello, the code that you provided seems right to me. I would do a couple of things to double-check that all the tensors have the appropriate values:

InstantWindy commented 5 years ago

Yeah, maybe it's because I trained on the DAVIS 2017 dataset, whose labels contain multiple categories. If I want to train on the DAVIS 2017 dataset, how do I map the label values of multiple categories to 0 and 1? My idea is to set the pixel values of the target category to 1 and the background to 0.

scaelles commented 5 years ago

In the multiple-object scenario, the evaluation gets a bit trickier, so I would recommend using the Python package that we provide for DAVIS 2017. If there are multiple objects, the pixel value of each object should be the same as the one that object has in the first frame.
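
A minimal sketch of splitting a multi-object label map into per-object binary masks, assuming the annotation has been loaded as an integer array in which 0 is the background and each other pixel value is an object ID. The function name `split_objects` is hypothetical and is not part of the DAVIS package.

```python
import numpy as np

def split_objects(label_map):
    """Hypothetical helper: return {object_id: binary mask} for every
    non-zero ID found in an integer label map (0 assumed to be background)."""
    label_map = np.asarray(label_map)
    ids = [int(i) for i in np.unique(label_map) if i != 0]
    return {i: (label_map == i).astype(np.uint8) for i in ids}

# Toy 3x3 annotation with two objects (IDs 1 and 2).
labels = np.array([[0, 1, 1],
                   [2, 2, 0],
                   [0, 0, 2]])

masks = split_objects(labels)

# Alternative: collapse every object into a single foreground mask.
foreground = (labels > 0).astype(np.uint8)

for object_id, mask in masks.items():
    print(object_id)
    print(mask)
```

Each entry of `masks` is a 0/1 map for one object, which is the form a binary metric expects. The collapsed `foreground` mask discards object identity, so it only fits a single-object setting.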

InstantWindy commented 5 years ago

I am using the official DAVIS 2017 evaluation code, but it requires the input to be a binary map. So how do you convert multi-object labels into binary maps? Thank you!

scaelles commented 5 years ago

I think the best option is to evaluate the whole sequence, rather than every frame, using this.

If you want to evaluate per frame, you have to write a loop yourself that evaluates one object at a time. Keep in mind that the DAVIS evaluation first computes the mean for each object over the whole sequence and then takes the mean over all the objects. So don't average over all the objects within each frame.
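
The averaging order described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the official DAVIS implementation: `jaccard` and `sequence_score` are hypothetical names, label maps are assumed to use 0 for background, and the object IDs are read from the first ground-truth frame.

```python
import numpy as np

def jaccard(annotation, segmentation):
    """Intersection over union of two boolean masks (illustrative only)."""
    union = np.logical_or(annotation, segmentation).sum()
    if union == 0:
        return 1.0  # both empty: treated as a perfect match in this sketch
    return np.logical_and(annotation, segmentation).sum() / union

def sequence_score(gt_frames, pred_frames):
    """Mean over frames for each object first, then mean over objects."""
    object_ids = [int(i) for i in np.unique(gt_frames[0]) if i != 0]
    per_object = {}
    for oid in object_ids:
        # Evaluate one object at a time, across the whole sequence.
        scores = [jaccard(gt == oid, pred == oid)
                  for gt, pred in zip(gt_frames, pred_frames)]
        per_object[oid] = float(np.mean(scores))
    return float(np.mean(list(per_object.values()))), per_object

# Toy sequence: two frames, two objects.
gt_frames = [np.array([[1, 1, 2, 2]]), np.array([[1, 1, 2, 2]])]
pred_frames = [np.array([[1, 1, 2, 2]]), np.array([[1, 0, 2, 2]])]

mean_j, per_object = sequence_score(gt_frames, pred_frames)
print(per_object)
print(mean_j)
```

In the toy sequence, object 1 scores 1.0 and 0.5 over the two frames and object 2 scores 1.0 in both, so the per-object means are 0.75 and 1.0 before the final average.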