pytorch-labs / attention-gym

Helpful tools and examples for working with flex-attention
BSD 3-Clause "New" or "Revised" License
434 stars 22 forks

Update softcapping.py #23

Closed by Chillee 2 months ago