Closed: AlpinDale closed this 7 months ago
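This PR adds support for Top-A sampling (invented by BlinkDL from the RWKV team). The implementation is based on the min_p implementation here, since min_p is just a linear version of top_a: min_p scales its cutoff linearly with the top token's probability, while top_a scales it with the square of that probability.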
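To make the relationship between the two samplers concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the function names `top_a_filter` and `min_p_filter` are hypothetical, and it assumes the commonly described rules, where Top-A drops tokens with probability below `top_a * p_max**2` and min_p drops tokens with probability below `min_p * p_max`.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def _mask_below(logits, probs, threshold):
    """Set logits whose probability is below the threshold to -inf."""
    return [x if p >= threshold else float("-inf") for x, p in zip(logits, probs)]


def top_a_filter(logits, top_a):
    """Hypothetical helper: assumed Top-A rule, cutoff = top_a * p_max^2."""
    probs = softmax(logits)
    return _mask_below(logits, probs, top_a * max(probs) ** 2)


def min_p_filter(logits, min_p):
    """Hypothetical helper: assumed min_p rule, cutoff = min_p * p_max."""
    probs = softmax(logits)
    return _mask_below(logits, probs, min_p * max(probs))


# Illustrative distribution (made up for this example), given as log-probabilities.
example_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]

top_a_kept = [x != float("-inf") for x in top_a_filter(example_logits, 0.5)]
min_p_kept = [x != float("-inf") for x in min_p_filter(example_logits, 0.5)]
```

With the same coefficient, the Top-A cutoff is never higher than the min_p cutoff, because `p_max` is at most 1 and squaring it can only shrink the threshold. Top-A therefore prunes less when the model is uncertain (small `p_max`) and approaches min_p when the model is confident.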