zp19990818 closed this issue 1 year ago
Hi,
- TMO processes each video frame after resizing it to 384p resolution and then resizes the output back to the original resolution. If you want to preserve sharp edges, you can simply skip both resizing steps.
- B is the batch size. It is used to determine whether TMO is in training mode or testing mode. The threshold for adaptive output selection is defined as h.
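As a rough illustration of the first point, the resize, predict, resize-back flow can be sketched as below. This is a minimal sketch and not the repository code: `resize_nearest`, `segment_frame`, the `predict` callable, the `use_resize` flag, and the `(384, 384)` working size are all hypothetical names and values chosen for the example, and nearest-neighbour interpolation is used only to keep the sketch dependency-light.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return img[rows[:, None], cols]

def segment_frame(frame, predict, work_size=(384, 384), use_resize=True):
    """Run `predict` on one frame, optionally at a fixed working resolution.

    `predict` stands in for the network: it maps an (h, w, C) frame to an
    (h, w) score map. `work_size` is an assumed value, not taken from TMO.
    """
    if not use_resize:
        # Skip both resizing steps: predict at the native resolution.
        return predict(frame)
    orig_h, orig_w = frame.shape[:2]
    small = resize_nearest(frame, *work_size)      # downscale the input
    score = predict(small)                         # predict at working size
    return resize_nearest(score, orig_h, orig_w)   # upscale the output back
```

With `use_resize=False` the stand-in model sees the frame at its original size, so no interpolation touches the object boundaries.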
Thank you, I got it!
Hello,
I'm coming back to this. I have two questions about output selection:
- I trained and tested the new TMO without output selection, and the metrics are the same as for the previous version of TMO, but the visualized binary maps have significantly jagged edges. I'm curious why the visualizations differ when the computed metrics are the same; jagged edges should lower the mIoU. The binary map predicted by the new TMO is below. (Is it because of the parameter B? Is the output a soft score?)
- On my own dataset, performance drops significantly when output selection is used. From the code, output selection seems to compare the percentage of well-defined (non-blurred) pixels in each saliency map. Isn't this a somewhat inappropriate way to measure confidence? For example, structural similarity is not taken into account. The parameter B here is meant to be binary, so should output selection be applied when calculating the final score?
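To make the second question concrete, here is a minimal sketch of output selection as I understand it. It is not the repository code: `confident_fraction`, `select_output`, the two generic candidate maps, the tie-breaking rule, and the default `h=0.05` are my own assumptions for illustration.

```python
import numpy as np

def confident_fraction(score, h):
    """Fraction of pixels whose soft score lies within h of 0 or 1."""
    score = np.asarray(score, dtype=float)
    return float(np.mean((score < h) | (score > 1.0 - h)))

def select_output(candidate_a, candidate_b, h=0.05):
    """Return the candidate map with the larger share of confident pixels.

    Ties keep `candidate_a`. The value of `h` is an assumed default.
    """
    if confident_fraction(candidate_a, h) >= confident_fraction(candidate_b, h):
        return candidate_a
    return candidate_b
```

A measure like this only counts pixels one by one, so two maps with very different spatial structure can receive the same confidence, which is the concern raised above.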
Looking forward to your reply~