Closed: EternityZY closed this issue 4 years ago.
I could be wrong, but I don't know any way other than annotating frame by frame manually (or semi-automatically, using https://github.com/opencv/cvat for example).
Another approach (which I'm actually going to try) is to generate ground truth using existing good algorithms, like ViBe, or other algorithms that solve your particular problem. More specifically, you can go to https://supervise.ly and use Mask R-CNN to create an initial segmentation for the classes you'd like to have as foreground, then manually correct it where necessary. I guess you can even train FgSegNet there...
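If you go the ViBe route, the core idea is small enough to sketch. Below is a minimal single-channel version in NumPy; the function names, default parameters, and the noise-based initialization are my own simplifications (real ViBe initializes each pixel's model from spatial neighbors and also propagates updates to neighbors), so treat this as an illustration, not a faithful implementation:

```python
import numpy as np

def vibe_init(frame, n_samples=20, rng=None):
    # Build a per-pixel background model of n_samples values.
    # Simplification: jitter copies of the first frame with noise
    # instead of sampling from the spatial neighborhood as real ViBe does.
    rng = rng or np.random.default_rng(0)
    samples = np.repeat(frame[None, :, :], n_samples, axis=0).astype(np.int16)
    samples += rng.integers(-10, 10, size=samples.shape)
    return samples

def vibe_segment(frame, samples, radius=20, min_matches=2,
                 subsample=16, rng=None):
    # A pixel is background if at least min_matches stored samples
    # lie within `radius` of the current value.
    rng = rng or np.random.default_rng(1)
    dist = np.abs(samples - frame[None, :, :].astype(np.int16))
    matches = (dist < radius).sum(axis=0)
    fg = matches < min_matches
    # Conservative, stochastic update: at background pixels only,
    # replace one random sample with probability 1/subsample.
    update = (~fg) & (rng.integers(0, subsample, size=fg.shape) == 0)
    idx = rng.integers(0, samples.shape[0], size=fg.shape)
    ys, xs = np.nonzero(update)
    samples[idx[ys, xs], ys, xs] = frame[ys, xs]
    return fg.astype(np.uint8)
```

Masks produced this way will be noisy, which is why the manual-correction pass (in CVAT or supervise.ly) is still needed before using them as ground truth.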
Hello @EternityZY. My apologies for the delayed response. Thanks @c58 for your response.
FgSegNet is a scene-specific network whose main focus is segmenting objects seen by stationary cameras (e.g. a camera installed along a road).
In this case, if you want to adapt it to new videos, you need to fine-tune FgSegNet with a minimal training set (freeze most of the weights and fine-tune some layers, depending on how much training data you have). Hope this gives you a hint.
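In Keras (the framework FgSegNet is built on), the freeze-and-fine-tune step above boils down to setting `trainable = False` on the layers you want to keep and recompiling. A minimal sketch, assuming a generic Keras model; the helper name, the choice of optimizer, loss, and learning rate are mine, not taken from the FgSegNet repo:

```python
from tensorflow import keras

def prepare_for_finetune(model, n_trainable=2, lr=1e-4):
    """Freeze all but the last n_trainable layers, then recompile.

    Recompiling is required: trainable flags are only picked up
    by the optimizer at compile time.
    """
    for layer in model.layers:
        layer.trainable = False
    for layer in model.layers[-n_trainable:]:
        layer.trainable = True
    # Loss/optimizer here are illustrative; use whatever the
    # original training script uses.
    model.compile(optimizer=keras.optimizers.Adam(lr),
                  loss="binary_crossentropy")
    return model
```

With a small annotated set from the new video, you would then call `model.fit` on that set; a low learning rate helps avoid destroying the pretrained features.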
(Reopen the issue as needed.)
Thank you for your work. I have a question: I see that the code trains a separate model for each test sequence. Now I have a real video (not in the dataset) with no ground truth. How should I extract the foreground?