toddnief / bidirectional-reversal

What if we do SFT with no positional embeddings, then do it with positional embeddings? #16

Open toddnief opened 5 months ago
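
One possible reading of the proposal, as a rough sketch rather than anything taken from this repo: make the positional embedding a switch on the input layer, so the same fine-tuning loop can run first with it off and then with it on. Everything below is hypothetical (the `embed` helper, the `use_positions` flag, the toy embedding table), and the sinusoidal encoding is just a stand-in for whatever positional scheme the model actually uses. The issue does not say whether the second phase continues from the first phase's weights or starts over, so the sketch leaves that open and only shows the switch.

```python
import math

def sinusoidal_position(pos, dim):
    """Sinusoidal encoding for one position (dim is assumed to be even)."""
    vec = []
    for i in range(0, dim, 2):
        angle = pos / (10000 ** (i / dim))
        vec.extend([math.sin(angle), math.cos(angle)])
    return vec

def embed(token_ids, table, use_positions):
    """Look up token vectors; add position vectors only if use_positions is True."""
    dim = len(table[0])
    out = []
    for pos, tok in enumerate(token_ids):
        vec = list(table[tok])
        if use_positions:
            pe = sinusoidal_position(pos, dim)
            vec = [v + p for v, p in zip(vec, pe)]
        out.append(vec)
    return out

# Hypothetical toy vocabulary: 3 tokens, 4-dimensional embeddings.
table = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]

# Phase 1 (positions off): a token's input vector ignores where it appears,
# so a sequence and its reversal yield the same vectors in mirrored order.
forward = embed([0, 1, 2], table, use_positions=False)
backward = embed([2, 1, 0], table, use_positions=False)

# Phase 2 (positions on): the same token's vector now depends on its index.
with_pos = embed([0, 1, 2], table, use_positions=True)
```

This only covers the input embeddings. An actual experiment would wrap the flag around the model's own positional mechanism and run the SFT loop once per setting.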