lim-anggun / FgSegNet_v2

FgSegNet_v2: "Learning Multi-scale Features for Foreground Segmentation" by Long Ang LIM and Hacer YALIM KELES
https://arxiv.org/abs/1808.01477

How to handle real video #13

Closed: EternityZY closed this issue 4 years ago

EternityZY commented 5 years ago

Thank you for your work. I have a question: I see that the code trains a separate model for each test video in the dataset. Now I have a real video (not in the dataset) with no ground truth. How should I extract the foreground?

c58 commented 4 years ago

I could be wrong, but I don't know of any way other than annotating frame by frame, either manually or semi-automatically (using https://github.com/opencv/cvat, for example).

c58 commented 4 years ago

The other approach (which I'm actually going to try) is to generate ground truth using existing good algorithms, like ViBe, or other algorithms that solve this particular problem. More specifically, you can go to https://supervise.ly and use Mask R-CNN to create an initial segmentation for the classes you'd like to have as foreground, then manually correct it where necessary. I guess you can even train FgSegNet there...
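For the classical-algorithm route, here is a rough sketch of what I mean. ViBe itself isn't bundled with OpenCV, so this uses `cv2.createBackgroundSubtractorMOG2` as a stand-in; the video path, output directory, and parameters are placeholders, not anything from this repo:

```python
import os
import cv2

# Generate rough foreground masks with a classical background subtractor;
# these serve as initial pseudo ground truth to be corrected by hand.
cap = cv2.VideoCapture("my_video.mp4")  # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=500, varThreshold=16, detectShadows=True)

os.makedirs("pseudo_gt", exist_ok=True)
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # MOG2 marks shadows as 127 and foreground as 255; keep only
    # confident foreground pixels in the binary mask.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    cv2.imwrite(os.path.join("pseudo_gt", f"gt{idx:06d}.png"), mask)
    idx += 1
cap.release()
```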

lim-anggun commented 4 years ago

Hello @EternityZY. My apologies for the delayed reply, and thanks @c58 for your response.

FgSegNet is a scene-specific network whose main focus is to segment objects in footage from stationary cameras (e.g., a camera installed along a road).

In this case, if you want to adapt it to new videos, you need to fine-tune FgSegNet with a minimal training set (freeze the weights and fine-tune some layers, depending on the size of your training set). Hope this gives you a hint; a rough sketch follows.
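A minimal Keras sketch of what "freeze the weights and fine-tune some layers" could look like. The file names, the number of unfrozen layers, and the hyperparameters are illustrative assumptions, not values from this repo's training scripts:

```python
import numpy as np
from keras.models import load_model
from keras.optimizers import Adam

# Placeholder data: a handful of frames from the new video and their
# manually annotated binary masks (hypothetical .npy files).
X_new = np.load("new_frames.npy")
Y_new = np.load("new_masks.npy")

# Load a pretrained scene-specific model (placeholder path; if the saved
# model uses custom layers or losses, pass them via custom_objects).
model = load_model("FgSegNet_v2_pretrained.h5")

# Freeze all weights, then unfreeze only the last few layers; how many
# to unfreeze depends on how much annotated data you have.
for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-5:]:
    layer.trainable = True

model.compile(optimizer=Adam(lr=1e-4), loss="binary_crossentropy")
model.fit(X_new, Y_new, batch_size=1, epochs=50, verbose=1)
```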

(reopen the issue as needed)