Open Stephen-K1 opened 7 months ago
No. The whole point of our research is to replace conv with GRU. The GRU's recurrent architecture allows the model to analyze the video sequence with temporal memory. If you replace it with Conv, the model will treat each frame independently, and the output will flicker.
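To make the temporal-memory point concrete, here is a toy sketch in plain Python. It is not the RVM layer: the scalar `gru_step`, the stateless `per_frame` function, the weights (all set to 1.0), and the two example clips are assumptions chosen only for illustration. It shows that a stateless per-frame operation gives the same output for the same frame regardless of what came before, while a recurrent update does not.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def per_frame(x, w=1.0):
    """Stateless stand-in for a conv layer on one frame: output depends only on x."""
    return math.tanh(w * x)


def gru_step(x, h, wz=1.0, uz=1.0, wr=1.0, ur=1.0, wh=1.0, uh=1.0):
    """One scalar GRU update (hypothetical weights); h carries memory of earlier frames."""
    z = sigmoid(wz * x + uz * h)               # update gate
    r = sigmoid(wr * x + ur * h)               # reset gate
    h_cand = math.tanh(wh * x + uh * (r * h))  # candidate state
    # One common convention; some libraries swap the roles of z and (1 - z).
    return (1.0 - z) * h + z * h_cand


def run_gru(frames):
    """Feed the frames through the recurrence in order, starting from a zero state."""
    h = 0.0
    outputs = []
    for x in frames:
        h = gru_step(x, h)
        outputs.append(h)
    return outputs


# Two clips that end on the same frame value but have different histories.
clip_a = [1.0, 1.0, 1.0, 0.2]
clip_b = [-1.0, -1.0, -1.0, 0.2]

stateless_a = [per_frame(x) for x in clip_a]
stateless_b = [per_frame(x) for x in clip_b]
recurrent_a = run_gru(clip_a)
recurrent_b = run_gru(clip_b)

print("stateless last frame:", stateless_a[-1], stateless_b[-1])
print("recurrent last frame:", recurrent_a[-1], recurrent_b[-1])
```

The stateless outputs for the last frame are identical, while the recurrent outputs differ because each one still reflects the preceding frames. In a video model, that dependence on history is what lets the network keep its prediction consistent from frame to frame instead of re-deciding each frame in isolation.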
I have not been following matting research lately, but here are some ideas off the top of my head:
In the RVM model, the GRU layer accounts for a large share of the computation. It is natural to ask: would it be better to replace the GRU layer with a Conv layer that uses the same amount of computation? A simple 'yes' or 'no' would be greatly appreciated.
Recently I've been trying my best to implement a matting model with excellent performance. I have read many recently proposed video matting papers and tested their matting performance. Even though RVM was proposed two years ago, it is still the best open-source model (training code included) in my tests. Could you provide some tips for improving the performance of RVM? I believe you have a lot of good ideas that are worth trying, and I would greatly appreciate it if you could share some of your insights here. Thank you very much!