bkitano / llama-from-scratch

Llama from scratch, or How to implement a paper without crying
https://blog.briankitano.com/llama-from-scratch/
482 stars, 46 forks

TypeError: MultiheadAttention.forward() got an unexpected keyword argument 'is_causal' #1

Closed: bjpcjp closed this issue 10 months ago

bkitano commented 10 months ago

I updated the notebook so that it no longer uses MultiheadAttention, which was the wrong implementation for this context.
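For readers who hit the same TypeError in their own code, here is a minimal sketch of a defensive call pattern. It assumes the error comes from an installed PyTorch version whose `MultiheadAttention.forward()` predates the `is_causal` argument; the thread does not confirm the cause. It is not the fix applied in the notebook, which dropped MultiheadAttention altogether. The names `supports_kwarg`, `causal_mask`, `call_attention`, `old_forward`, and `new_forward` are hypothetical, and the two `*_forward` functions are plain-Python stand-ins so the example runs without PyTorch.

```python
import inspect


def supports_kwarg(fn, name):
    """Return True if `fn` accepts a keyword argument called `name`."""
    params = inspect.signature(fn).parameters
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or accepts_any


def causal_mask(n):
    """n x n boolean mask: True marks positions that must NOT be attended to
    (column j is in the future of row i). This follows the boolean attn_mask
    convention of torch.nn.MultiheadAttention."""
    return [[j > i for j in range(n)] for i in range(n)]


# Hypothetical stand-ins for an older and a newer forward() signature.
def old_forward(query, key, value, attn_mask=None):
    return "ok"


def new_forward(query, key, value, attn_mask=None, is_causal=False):
    return "ok"


def call_attention(forward, q, k, v, seq_len):
    """Always pass an explicit causal mask; add is_causal only if supported."""
    kwargs = {"attn_mask": causal_mask(seq_len)}
    if supports_kwarg(forward, "is_causal"):
        kwargs["is_causal"] = True
    return forward(q, k, v, **kwargs)
```

Because the explicit mask is always passed, causality does not depend on whether the `is_causal` argument is available. With a real module, `causal_mask` would need to return a boolean tensor rather than a nested list.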