Implementation of SE3-Transformers for Equivariant Self-Attention, in PyTorch. This specific repository is geared towards integration with an eventual Alphafold2 replication.
Thank you for your work. I used your SE3 Transformer reproduction as part of my model, but my current test results are not very good. I suspect this may be because I do not fully understand your model. Here are my questions:
Does your model need pre-training?
Can I train the SE3 Transformer jointly with the fully connected layer that comes after it?
Any other advice is also welcome.
I've found that pre-training helps (100 batches, scaling linearly from 1e-6 up to 1e-4). I've also found that a smaller depth (2 or 3) works better than a larger one (>3).
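For reference, here is a minimal sketch of that schedule, assuming the 1e-6 → 1e-4 scale refers to a linear learning-rate warmup over the first 100 batches. The placeholder model and training loop are illustrative, not this repo's actual training code:

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import LambdaLR

# Placeholder network; substitute your SE3 Transformer model here.
model = nn.Linear(16, 1)

start_lr = 1e-6
target_lr = 1e-4
warmup_steps = 100  # 100 batches, as suggested above

optimizer = torch.optim.Adam(model.parameters(), lr=target_lr)

def warmup(step):
    # Linearly interpolate from 1e-6 to 1e-4 over the first 100 batches,
    # then hold the target learning rate. LambdaLR multiplies the base lr
    # (here 1e-4) by this factor.
    if step >= warmup_steps:
        return 1.0
    frac = step / warmup_steps
    return (start_lr + frac * (target_lr - start_lr)) / target_lr

scheduler = LambdaLR(optimizer, lr_lambda=warmup)

for step in range(200):
    x = torch.randn(8, 16)
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```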
I'm not sure what you mean here. The fully connected layer that acts on type-1 features (i.e. 3D coordinates) in the attention block? Or the linear projection that projects the final output from d×3 to 1×3 (i.e. the projection from the hidden dimension to the output dimension)?
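For context on the second interpretation: the d×3 → 1×3 projection amounts to a linear mix over the hidden channel dimension of the type-1 features. Below is a minimal sketch of such a projection; the `Type1Projection` class is hypothetical, not this repo's API. Because only the channel dimension is mixed (and there is no bias), the map commutes with rotations and stays equivariant:

```python
import torch
from torch import nn

class Type1Projection(nn.Module):
    """Project d type-1 (vector) channels down to dim_out channels,
    mixing only across the channel dimension so rotations commute with it."""
    def __init__(self, dim_in, dim_out=1):
        super().__init__()
        # No bias: adding a constant vector would break equivariance.
        self.weight = nn.Parameter(torch.randn(dim_out, dim_in) / dim_in ** 0.5)

    def forward(self, feats):
        # feats: (batch, nodes, dim_in, 3) -> (batch, nodes, dim_out, 3)
        return torch.einsum('o i, b n i c -> b n o c', self.weight, feats)

proj = Type1Projection(dim_in=64)
type1 = torch.randn(2, 32, 64, 3)  # hidden type-1 features
out = proj(type1)                  # (2, 32, 1, 3) coordinate-like output
```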