Dear Feihu,

I can imagine how busy you must be, but I would be very grateful for any insights. In another issue I asked about memory consumption, and you mentioned that RAFT also looks up its 4D cost volume over the refinement iterations, which is why this released implementation of SeparableFlow uses more memory than RAFT.

I changed the code so that the 4D cost volume is not looked up during the refinement iterations; following the paper, it is only used as a base to compute C_u and C_v. That reduced the memory consumption only negligibly.

You mention in the paper that the 4D cost volume does not have to be stored, but we need it for back-propagation. Could you elaborate on how to avoid storing it, and on how you achieved the large memory reduction in the original SeparableFlow?

I am also confused by the statement "Note that Cu can be computed without storing the intermediate 4D cost volume C" in Sec. 3.2.1. How do you perform back-propagation if you release the GPU memory of the 4D cost volume C?
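To make sure I understand the question I am asking: is the idea something like gradient checkpointing, where C is recomputed during the backward pass instead of being kept around? Here is a minimal sketch of what I mean, with toy shapes and my own naming (`cu_from_features` is mine, not from your code), using PyTorch's `torch.utils.checkpoint`:

```python
import torch
from torch.utils.checkpoint import checkpoint

def cu_from_features(fmap1, fmap2):
    """Build the 4D cost volume C and immediately reduce it to a C_u-like tensor.

    When wrapped in torch.utils.checkpoint, the intermediate `corr` tensor is
    not saved for backward; it is recomputed from fmap1/fmap2 during backprop.
    """
    # corr has shape (B, H, W, H2, W2): the full 4D cost volume C
    corr = torch.einsum('bchw,bcuv->bhwuv', fmap1, fmap2)
    # Aggregate over the v dimension (here: max and mean) to get C_u
    cu = torch.stack([corr.max(dim=-1).values, corr.mean(dim=-1)], dim=1)
    return cu  # shape (B, 2, H, W, H2)

# Toy example: only fmap1/fmap2 are saved for backward, not the 4D volume
fmap1 = torch.randn(1, 8, 6, 6, requires_grad=True)
fmap2 = torch.randn(1, 8, 6, 6, requires_grad=True)
cu = checkpoint(cu_from_features, fmap1, fmap2, use_reentrant=False)
cu.sum().backward()
```

Even with this, the 4D volume still exists transiently during the forward (and recomputed backward) pass, so I assume the real saving in your implementation comes from a fused kernel that never materializes C at full resolution at all. Is that the right way to read Sec. 3.2.1?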
Thanks for your support.
Azin