keras-team / keras-nlp

Modular Natural Language Processing workflows with Keras
Apache License 2.0

Fixes for the LLaMA backbone + add dropout #1499

Closed (tirthasheshpatel closed 5 months ago)

tirthasheshpatel commented 6 months ago

Bring the LLaMA backbone and its components closer to the Mistral implementation. Later, it would be nice to unify the two to reduce code duplication. The tests haven't been modified, so the behavior of the components remains the same.

Note that I don't update the checkpoint file in this PR; it would be better to do it once the other components (preprocessor and generator) are in.