chewry closed this issue 1 year ago
Hi @chewry, could you please be more specific? What do you mean by fine-tuning failing? Does the network forget how to perform stereo matching or does it simply worsen the disparity estimates?
Yes, the network forgets how to perform stereo matching. The fine-tuned model detects the texture of the objects instead. I use the same augmentation procedures as RAFT-stereo except _resize_sparse_flowmap. (The sparse resizing makes the results worse: textured regions are emphasized even more.)
Did you use all the augmentations in RAFT-stereo (color transform, erasing regions, resizing, horizontal flip, ...)?
Hi, can you share an output from your fine-tuned model?
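For readers unfamiliar with the sparse resizing step mentioned above, here is a minimal sketch of the general idea, assuming a single-channel disparity map with a boolean validity mask. It is not the RAFT-stereo implementation; the function name `resize_sparse_disparity` and its parameters are hypothetical and chosen only for illustration. Valid pixels are scattered to their new positions instead of being interpolated, so valid and invalid values are never blended.

```python
import numpy as np

def resize_sparse_disparity(disp, valid, fx=1.0, fy=1.0):
    """Resize a sparse disparity map by scattering valid pixels (no interpolation).

    disp:  (H, W) float array of disparities in pixels
    valid: (H, W) bool array, True where disp holds a ground-truth value
    fx/fy: horizontal / vertical scale factors (hypothetical parameter names)
    """
    ht, wd = disp.shape
    ht1, wd1 = int(round(ht * fy)), int(round(wd * fx))

    # Coordinates of the pixels that carry a ground-truth disparity.
    ys, xs = np.nonzero(valid)

    # Map the source coordinates onto the resized grid.
    xx = np.round(xs * fx).astype(np.int64)
    yy = np.round(ys * fy).astype(np.int64)
    inside = (xx >= 0) & (xx < wd1) & (yy >= 0) & (yy < ht1)

    disp_out = np.zeros((ht1, wd1), dtype=np.float32)
    valid_out = np.zeros((ht1, wd1), dtype=bool)

    # Disparity is a horizontal offset, so it scales with fx only.
    disp_out[yy[inside], xx[inside]] = disp[ys[inside], xs[inside]] * fx
    valid_out[yy[inside], xx[inside]] = True
    return disp_out, valid_out
```

When upscaling, the output is sparser than the input because each valid source pixel lands on exactly one target pixel, so the loss should be masked with the returned validity map.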
Thank you for sharing your nice work!
Inspired by your work, I fine-tuned the model to apply it to another domain. Unfortunately, fine-tuning failed. To check whether this is a domain problem, we also fine-tuned the model on the provided NeRF-stereo triplet dataset, but that failed as well. (The model detects texture rather than object boundaries.)
Because no training code is provided, I used the same hyperparameters and augmentation procedures as RAFT-stereo, as described in your paper. Is there anything else to note for training? If you have any tips, I would appreciate your advice.