Hi, thanks for the wonderful code, it is really great work. I have several questions about the pre-training on Synthia.
I saw you only used 10 sequences from the dataset for pre-training; why not use more? Is it because, as you said in the discussion, unsupervised approaches are limited by the loss function? So even if we had more data, the loss function couldn't lead us to better results? Thanks.
During pre-training, occlusion is not handled. Is there a reason why you don't use an occlusion mask for pre-training?
In your provided config.ini file, the number of pre-training iterations is 500K, instead of 300K as in the paper.
We did not train on the other sequences mainly because we did not spend much effort optimizing the pre-training; we just selected a subset of the dataset that was easy to handle and fairly fast to download. Optimizing the pre-training dataset or a specific schedule may lead to small final improvements, but this was not the main focus of our work. Also, I think further improving the loss (e.g. more accurate occlusion estimation, anisotropic regularizers, ...) should have much more impact than improving the pre-training.
As the usefulness of the forward-backward-based occlusion handling depends on how well the flow is already estimated, we first trained a model with an "easier" loss to obtain a good basic flow estimate before trying to learn to handle occlusions.
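For reference, here is a minimal NumPy sketch of the kind of forward-backward consistency check typically used to estimate occlusions. The function name, the nearest-neighbour warping, and the default thresholds `alpha1`/`alpha2` are my own simplifications for illustration, not the exact implementation in this repo:

```python
import numpy as np

def fb_occlusion_mask(flow_fw, flow_bw, alpha1=0.01, alpha2=0.5):
    """Estimate an occlusion mask from forward/backward flow consistency.

    flow_fw, flow_bw: arrays of shape (H, W, 2) holding (x, y) displacements.
    A pixel is marked occluded when the forward flow and the backward flow
    sampled at its forward-warped position do not approximately cancel out.
    alpha1/alpha2 are typical consistency thresholds; tune them per dataset.
    """
    h, w, _ = flow_fw.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Target coordinates of each pixel under the forward flow.
    xt = np.clip(xs + flow_fw[..., 0], 0, w - 1)
    yt = np.clip(ys + flow_fw[..., 1], 0, h - 1)

    # Backward flow sampled at the forward-warped position
    # (nearest neighbour for brevity; bilinear sampling is more common).
    bw_warped = flow_bw[np.round(yt).astype(int), np.round(xt).astype(int)]

    # Forward-backward consistency check: large residual => occluded.
    sq_diff = np.sum((flow_fw + bw_warped) ** 2, axis=-1)
    sq_mag = np.sum(flow_fw ** 2 + bw_warped ** 2, axis=-1)
    return sq_diff > alpha1 * sq_mag + alpha2  # boolean (H, W) mask
```

The reliability of this mask clearly depends on the quality of both flow estimates, which is why pre-training with the simpler, occlusion-free loss comes first.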
Thanks! I will fix the config.ini. 300K should suffice for convergence.
Thank you very much.