This pull request implements attention-based building blocks for neural networks using the tch-rs library. The implemented components include:
GeGlu: A GELU-gated linear unit (a GLU variant) used as the activation in the feed-forward layer; a sketch appears after this list.
FeedForward: A feed-forward layer with GeGlu activation.
CrossAttention: A cross-attention layer implementing scaled dot-product query-key-value attention (see the sketch after this list).
BasicTransformerBlock: A basic Transformer block composed of cross-attention and feed-forward layers.
SpatialTransformer: A spatial transformer model (also known as Transformer2DModel) that applies a series of BasicTransformerBlock layers.
AttentionBlock: An attention block that performs self-attention on the input tensor.
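As a rough illustration of the GeGlu building block, here is a minimal sketch of how such a layer might look with tch-rs. It is not the PR's exact code: the struct layout, the `proj` path name, and the `gelu("none")` call (whose signature varies across tch versions) are assumptions.

```rust
use tch::{nn, nn::Module, Tensor};

// GeGLU: project the input to 2 * dim_out, split the result in half,
// and gate the first half with GELU applied to the second half.
#[derive(Debug)]
struct GeGlu {
    proj: nn::Linear,
}

impl GeGlu {
    fn new(vs: nn::Path, dim_in: i64, dim_out: i64) -> Self {
        let proj = nn::linear(&vs / "proj", dim_in, 2 * dim_out, Default::default());
        GeGlu { proj }
    }
}

impl Module for GeGlu {
    fn forward(&self, xs: &Tensor) -> Tensor {
        // Split the projection into value and gate along the last dimension.
        let hidden_and_gate = xs.apply(&self.proj).chunk(2, -1);
        // Recent tch versions take an approximation argument ("none" or "tanh").
        &hidden_and_gate[0] * hidden_and_gate[1].gelu("none")
    }
}
```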
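And a hedged sketch of a multi-head cross-attention forward pass in the same style. The parameter names (`to_q`, `to_k`, `to_v`, `to_out`), the bias-free projections, and the head-reshaping helpers mirror common Stable Diffusion implementations but are assumptions here, not necessarily this PR's exact code.

```rust
use tch::{nn, nn::Module, Tensor};

#[derive(Debug)]
struct CrossAttention {
    to_q: nn::Linear,
    to_k: nn::Linear,
    to_v: nn::Linear,
    to_out: nn::Linear,
    heads: i64,
    scale: f64,
}

impl CrossAttention {
    fn new(vs: nn::Path, query_dim: i64, context_dim: Option<i64>, heads: i64, dim_head: i64) -> Self {
        let inner_dim = heads * dim_head;
        // Keys/values come from the context when given, otherwise from the query input.
        let context_dim = context_dim.unwrap_or(query_dim);
        let no_bias = nn::LinearConfig { bias: false, ..Default::default() };
        let to_q = nn::linear(&vs / "to_q", query_dim, inner_dim, no_bias);
        let to_k = nn::linear(&vs / "to_k", context_dim, inner_dim, no_bias);
        let to_v = nn::linear(&vs / "to_v", context_dim, inner_dim, no_bias);
        let to_out = nn::linear(&vs / "to_out", inner_dim, query_dim, Default::default());
        let scale = 1.0 / (dim_head as f64).sqrt();
        CrossAttention { to_q, to_k, to_v, to_out, heads, scale }
    }

    // (batch, seq, heads * dim_head) -> (batch * heads, seq, dim_head)
    fn heads_to_batch_dim(&self, xs: &Tensor) -> Tensor {
        let (b, n, d) = xs.size3().unwrap();
        xs.reshape(&[b, n, self.heads, d / self.heads])
            .permute(&[0, 2, 1, 3])
            .reshape(&[b * self.heads, n, d / self.heads])
    }

    // (batch * heads, seq, dim_head) -> (batch, seq, heads * dim_head)
    fn batch_dim_to_heads(&self, xs: &Tensor) -> Tensor {
        let (b, n, d) = xs.size3().unwrap();
        xs.reshape(&[b / self.heads, self.heads, n, d])
            .permute(&[0, 2, 1, 3])
            .reshape(&[b / self.heads, n, d * self.heads])
    }

    fn forward(&self, xs: &Tensor, context: Option<&Tensor>) -> Tensor {
        // Self-attention when no context is given, cross-attention otherwise.
        let context = context.unwrap_or(xs);
        let q = self.heads_to_batch_dim(&xs.apply(&self.to_q));
        let k = self.heads_to_batch_dim(&context.apply(&self.to_k));
        let v = self.heads_to_batch_dim(&context.apply(&self.to_v));
        // Scaled dot-product attention: softmax(q k^T / sqrt(dim_head)) v.
        let attn = (q.matmul(&k.transpose(-1, -2)) * self.scale).softmax(-1, q.kind());
        self.batch_dim_to_heads(&attn.matmul(&v)).apply(&self.to_out)
    }
}
```

The optional `context` argument is what lets a single layer type serve both the self-attention and cross-attention sublayers of a BasicTransformerBlock, which is presumably the pattern this PR follows as well.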