[Open] Jerry-Master opened this issue 10 months ago
Unlike the LSTM, the GRU by design does not have a separate hidden state and forward output; they are the same tensor. See this diagram.
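For anyone reading along, here is a minimal sketch of one GRU step that makes the point concrete: the tensor returned at each step serves both as the layer output and as the hidden state carried to the next time step. The weight names (`Wz`, `Uz`, etc.) are my own, not from the repo.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step. The returned tensor is BOTH the layer output at
    this time step and the hidden state passed to the next step --
    unlike the LSTM, there is no separate cell state."""
    z = sigmoid(x @ Wz + h_prev @ Uz)               # update gate
    r = sigmoid(x @ Wr + h_prev @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h_prev) @ Uh)   # candidate state
    h = (1 - z) * h_prev + z * h_tilde              # new state = output
    return h
```

Contrast this with an LSTM cell, which would return a `(h, c)` pair and expose only `h` to the next layer.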
The (1 - z) term was placed opposite to the paper's notation, but the two forms are equivalent.
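A quick numeric check of that equivalence: since sigmoid(-a) = 1 - sigmoid(a), swapping z and (1 - z) in the interpolation amounts to negating the gate's pre-activation, which the learned weights can absorb. The values below are arbitrary, just for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a = np.array([0.3, -1.2, 2.0])        # gate pre-activation (arbitrary)
h_prev = np.array([0.5, -0.4, 0.9])   # previous hidden state (arbitrary)
h_tilde = np.array([0.1, 0.7, -0.3])  # candidate state (arbitrary)

# One convention: new state = (1 - z) * h_prev + z * h_tilde
z = sigmoid(a)
h_one = (1 - z) * h_prev + z * h_tilde

# The other convention, with the gate pre-activation negated:
# sigmoid(-a) = 1 - sigmoid(a), so the weights just flip sign.
z_flipped = sigmoid(-a)
h_other = z_flipped * h_prev + (1 - z_flipped) * h_tilde

assert np.allclose(h_one, h_other)
```

So training can reach exactly the same functions under either convention; only the sign of the gate's weights differs.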
So I believe my original implementation was correct.
I mean, you say in the article that o_t is the output of the layer and h is the hidden state, so it makes sense to pass the output to the next layer and the hidden state to the next time step. I was wondering whether you have tried both, or have any intuition about which option performs better, because computationally they are very similar.
Looking at the formulas in your article, I see that your GRU implementation does not match the code you provide. I don't want you to merge this fork, since it would break compatibility, but I'm leaving it here in case you want to discuss the performance of this fixed ConvGRU implementation. It seems you are recycling the hidden state as if it were the forward activation. That is a valid approach, but it seems more reasonable to me to separate the hidden state from the forward activation.
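To make the separation I'm proposing concrete, here is a rough sketch of a ConvGRU cell that returns a distinct forward activation alongside the hidden state. The `out_conv` projection head is my own assumption for illustration; it is not taken from your repo or from my fork.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Sketch of a ConvGRU cell separating the hidden state from the
    forward activation. The 1x1 out_conv head is a hypothetical way
    to produce a distinct output for the next layer."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)
        self.out_conv = nn.Conv2d(hid_ch, hid_ch, 1)  # hypothetical output head

    def forward(self, x, h):
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)                      # update / reset gates
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        h_new = (1 - z) * h + z * h_tilde              # hidden state -> next time step
        o = self.out_conv(h_new)                       # forward activation -> next layer
        return o, h_new
```

With this shape, the caller passes `o` up the stack and carries `h_new` forward in time, exactly mirroring the o_t / h distinction from the article.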