generalizing to greater inter-coordinate distances

lucidrains / En-transformer

Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network

MIT License


Closed by lucidrains 2 years ago

lucidrains commented 2 years ago

Use the NeRF-like technique from https://arxiv.org/abs/2111.09883, where the attention bias is computed from the log of the pairwise distance: attn_bias = mlp(log(dist + 1))
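A minimal sketch of what attn_bias = mlp(log(dist + 1)) could look like, written in NumPy to keep it dependency-light rather than in the repo's PyTorch. The function names (init_mlp, distance_attn_bias), the two-layer ReLU MLP, the hidden size, and the one-bias-per-head output are all assumptions for illustration, not code from this repository or from the paper.

```python
import numpy as np

def init_mlp(hidden_dim, heads, seed=0):
    # Hypothetical two-layer MLP: scalar log-distance -> one bias per head.
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.1, size=(1, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, 0.1, size=(hidden_dim, heads)),
        "b2": np.zeros(heads),
    }

def distance_attn_bias(coords, params):
    # coords: (n, d) array of point coordinates.
    diff = coords[:, None, :] - coords[None, :, :]      # (n, n, d)
    dist = np.linalg.norm(diff, axis=-1)                # (n, n) pairwise distances
    # log(dist + 1) compresses large distances, which is the point of the
    # technique: far-apart pairs land in a range the MLP has seen before.
    feat = np.log(dist + 1.0)[..., None]                # (n, n, 1)
    hidden = np.maximum(feat @ params["w1"] + params["b1"], 0.0)  # ReLU
    bias = hidden @ params["w2"] + params["b2"]         # (n, n, heads)
    return np.moveaxis(bias, -1, 0)                     # (heads, n, n)

coords = np.random.default_rng(1).normal(size=(5, 3))
params = init_mlp(hidden_dim=16, heads=4)
bias = distance_attn_bias(coords, params)               # shape (4, 5, 5)
```

In this sketch the bias would be added to the attention logits before the softmax. Because it depends only on pairwise distances, it is unchanged by translating or rotating the coordinates.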

lucidrains commented 2 years ago

Unable to substitute the rotary embeddings without taking a big hit in performance.