Open TomSuen opened 3 days ago
So my question is: what value of nms_ks did you use in the flow_sampler function for the watershed sampler? I set it to 3 to get as many sampling points as possible, but it is hard to reconstruct the original video from these points alone. Is this normal?
We set nms_ks=15 during the training process. I think nms_ks=3 may be a little too small to train the model.
By the way, I found that one possible reason is that the masks are all taken from the first frame. If an object does not move in the first frame, it is difficult for the watershed algorithm to sample there, which leaves this object without guidance in the sparse flow guidance sequence, so the reconstruction is not ideal. Is that right?
Yes, the watershed algorithm is unable to sample points that remain stationary in the initial frame. However, this may not significantly impact the model's training, as it is unnecessary and even inadvisable to sample every moving part throughout the video.
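To make the two points above concrete, here is a minimal sketch of NMS-style sparse sampling on a flow field. It is not the repository's flow_sampler or its watershed strategy; `nms_sample`, `min_mag`, and the toy flow field are hypothetical names and data chosen for illustration, and plain Python lists are used to keep it dependency-free. It assumes a point is kept only if its flow magnitude is the maximum within an nms_ks x nms_ks window, and that pixels with near-zero flow are never candidates.

```python
import math
import random


def nms_sample(flow, nms_ks, min_mag=1e-3):
    """Return (row, col) points that are local maxima of flow magnitude.

    flow:    H x W nested list of (dx, dy) tuples.
    nms_ks:  odd window size; a pixel survives only if its magnitude is the
             maximum of its nms_ks x nms_ks neighbourhood (clipped at borders).
    min_mag: hypothetical threshold below which a pixel counts as stationary.
    """
    h, w = len(flow), len(flow[0])
    mag = [[math.hypot(dx, dy) for dx, dy in row] for row in flow]
    r = nms_ks // 2
    points = []
    for y in range(h):
        for x in range(w):
            m = mag[y][x]
            if m < min_mag:
                continue  # stationary pixel: never sampled
            window_max = max(
                mag[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            if m >= window_max:
                points.append((y, x))
    return points


# Toy flow field: the left half is stationary, the right half moves.
rng = random.Random(0)
H, W = 32, 32
flow = [
    [(rng.uniform(0.5, 2.0), 0.0) if x >= W // 2 else (0.0, 0.0) for x in range(W)]
    for y in range(H)
]

points_ks3 = nms_sample(flow, nms_ks=3)
points_ks15 = nms_sample(flow, nms_ks=15)
print("nms_ks=3  ->", len(points_ks3), "points")
print("nms_ks=15 ->", len(points_ks15), "points")
```

Under these assumptions, every point that survives with nms_ks=15 also survives with nms_ks=3, so a smaller kernel can only yield an equal or denser set of points, and no point is ever placed in the stationary half of the frame.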
Okay, thank you for your reply. When will you open-source the training code?
Hi, thank you for such wonderful work! I would like to ask a question about the preparation of the training set. I noticed that you mentioned in the paper:
During training, we randomly sample 14 video frames with a stride of 4. ...with a resolution of 256 × 256. We first train ... and directly taking the first frame together with the estimated optical flow from Unimatch.
So my question is: what value of nms_ks did you use in the flow_sampler function for the watershed sampler? I set it to 3 to get as many sampling points as possible, but it is hard to reconstruct the original video from these points alone. Is this normal?
By the way, I found that one possible reason is that the masks are all taken from the first frame. If an object does not move in the first frame, it is difficult for the watershed algorithm to sample there, which leaves this object without guidance in the sparse flow guidance sequence, so the reconstruction is not ideal. Is that right?
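For reference, the frame sampling quoted from the paper ("14 video frames with a stride of 4") could be sketched as below. This is only one reading of that sentence: it assumes the stride is the gap between consecutive sampled frames and that the start frame is drawn uniformly at random. `sample_frame_indices` and its parameters are hypothetical and do not come from the released code.

```python
import random


def sample_frame_indices(num_video_frames, num_frames=14, stride=4, rng=random):
    """Pick `num_frames` frame indices spaced `stride` apart from a random start."""
    # Number of source frames covered by one clip: 53 for 14 frames at stride 4.
    span = (num_frames - 1) * stride + 1
    if num_video_frames < span:
        raise ValueError(
            f"video has {num_video_frames} frames, need at least {span}"
        )
    start = rng.randrange(num_video_frames - span + 1)
    return [start + i * stride for i in range(num_frames)]


indices = sample_frame_indices(100, rng=random.Random(0))
print(indices)
```

Under this reading, a video needs at least 53 frames to provide one training clip.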