I wanted to train this network on the SPICE dataset (a similar task where I want to predict forces and energy from structure). I was comparing training speed with torchmd-net (https://github.com/torchmd/torchmd-net). For the same parameter count, torchmd-net is at least three times faster than Equiformer and twice as fast as Equiformer-v2. Is this expected, or a bug on my end?
I do not think parameter count is a good proxy for speed. Two models can have the same number of parameters but quite different speeds; for example, the same model run with different cutoff radii.
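To make this concrete, here is a minimal sketch (assuming PyTorch and `torch_cluster` are installed) showing that the cutoff radius alone changes the edge count, and hence the message-passing cost, while the parameter count stays fixed:

```python
import torch
from torch_cluster import radius_graph

# 1,000 random "atoms" in a 20 A box; no model parameters involved here,
# only geometry.
pos = torch.rand(1000, 3) * 20.0

for cutoff in (3.0, 5.0, 8.0):
    # A larger cutoff radius yields more edges per atom, so each
    # message-passing layer does proportionally more work.
    edge_index = radius_graph(pos, r=cutoff, max_num_neighbors=1000)
    print(f"cutoff={cutoff} A -> {edge_index.size(1)} edges")
```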
For a better comparison, you could fix the number of layers and use the same cutoff radius. Even in that case, Equiformer should be slower, because we (1) use degrees > 1 and (2) use non-linear messages in addition to attention. Please see the Equiformer paper for details and discussion.
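If you want to time the two models under matched settings, a rough sketch like the one below may help. `model` and `data` are placeholders for your actual torchmd-net / Equiformer instances and a real SPICE batch; the loss computation is also a placeholder to be replaced with your energy/force objective.

```python
import time
import torch

def time_train_step(model, data, n_warmup=10, n_iters=50):
    """Average wall-clock time per training step, after warmup iterations."""
    model.train()
    optimizer = torch.optim.Adam(model.parameters())
    for i in range(n_warmup + n_iters):
        if i == n_warmup:
            # Start timing only after warmup, once kernels are compiled/cached.
            if torch.cuda.is_available():
                torch.cuda.synchronize()
            start = time.perf_counter()
        optimizer.zero_grad()
        loss = model(data)  # placeholder: replace with your energy/force loss
        loss.backward()
        optimizer.step()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters
```

Per-step time measured this way, with matched layers and cutoff, is a more meaningful comparison than parameter count.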
As long as you can run Equiformer training, you can compare your results (errors and training time) with my training logs. If they are similar, your setup should be correct.
Let me know if you have any other specific questions.