[A3TGCN] This implementation is different with the original paper.

This is a great geometric temporal framework. But there are issues when I read the source code of A3TGCN and A3TGCN2. I read the source code from this repository (based on PyTorch) and compared it with the original paper and original code (based on TensorFlow). Then I found that the attention modules were different in detail. For the original version, the context vector Ct was calculated using the MLP and the hidden states of T-GCN, and $e_i$ should be determined as follows: description of the attention module and the original source code: https://github.com/lehaifeng/T-GCN/blob/e21fb99a5c56fdf667a4ff7a9c8236761f7377f8/A3T-GCN/A3T-GCN.py#L90-L109

But, this implementation just used the parameters self._attention to represent $e_i$ , as follows: https://github.com/benedekrozemberczki/pytorch_geometric_temporal/blob/e43e95d726174574eeae32ad8c4670631fd0a58d/torch_geometric_temporal/nn/recurrent/attentiontemporalgcn.py#L49 https://github.com/benedekrozemberczki/pytorch_geometric_temporal/blob/e43e95d726174574eeae32ad8c4670631fd0a58d/torch_geometric_temporal/nn/recurrent/attentiontemporalgcn.py#L74 https://github.com/benedekrozemberczki/pytorch_geometric_temporal/blob/e43e95d726174574eeae32ad8c4670631fd0a58d/torch_geometric_temporal/nn/recurrent/attentiontemporalgcn.py#L76-L78

Obviously, this is a modified version of A3TGCN, and the attention module has no relationship with the hidden state of the T-GCN module. In my opinion, this modification is significant, and should be marked in docs and related source code, or should be modified to be consistent with the original paper.

Thanks sincerely for your dedication! Best!

benedekrozemberczki / pytorch_geometric_temporal

[A3TGCN] This implementation is different with the original paper. #204