Modalities / modalities

A framework for training multimodal foundation models.
MIT License

Implementation of multiple attention mechanisms #138

Closed flxst closed 1 month ago

flxst commented 1 month ago

This PR implements manual attention and PyTorch flash attention, in addition to the previously implemented Dao flash attention. Grouped-Query Attention (GQA) is supported by all three mechanisms.
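As an illustration of what the "manual" path looks like next to PyTorch's fused kernel, here is a minimal sketch (not the PR's actual code; tensor shapes and the `repeat_interleave` trick for GQA are assumptions based on the common convention of sharing each KV head across a group of query heads):

```python
import math
import torch
import torch.nn.functional as F

def manual_attention(q, k, v, causal=True):
    # q: (B, n_q_heads, T, d); k, v: (B, n_kv_heads, T, d)
    # Grouped-Query Attention: each group of query heads shares one KV head,
    # so KV heads are repeated along the head dim to match the query heads.
    n_rep = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(n_rep, dim=1)
    v = v.repeat_interleave(n_rep, dim=1)
    # Scaled dot-product scores: (B, n_q_heads, T, T)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    if causal:
        T = q.shape[-2]
        # Mask out future positions (strict upper triangle).
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

B, Hq, Hkv, T, d = 2, 8, 2, 16, 32
q = torch.randn(B, Hq, T, d)
k = torch.randn(B, Hkv, T, d)
v = torch.randn(B, Hkv, T, d)

out_manual = manual_attention(q, k, v)

# PyTorch flash attention path: same math via the fused kernel,
# with KV heads repeated the same way to handle GQA.
n_rep = Hq // Hkv
out_sdpa = F.scaled_dot_product_attention(
    q,
    k.repeat_interleave(n_rep, dim=1),
    v.repeat_interleave(n_rep, dim=1),
    is_causal=True,
)
```

Both paths should agree up to floating-point tolerance, which is the usual sanity check when multiple attention backends coexist behind one interface.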