Motivation
The Calico models currently set the MLP and attention biases to true, but these were hard-coded to false in the Flash and Paged Llama implementations. This change uses the config parameters introduced in https://github.com/huggingface/transformers/pull/30031 to set those values correctly.
Modifications
Added attention_bias and mlp_bias to the config for the Flash and Paged Llama implementations (both default to False)
Set the bias of the attention and MLP projection layers from these config values (see the sketch below)
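A minimal sketch of the idea, using illustrative class names rather than this repository's actual Flash/Paged Llama modules (and with projection shapes simplified): the attention and MLP layers read their bias flags from the config (attention_bias / mlp_bias) instead of hard-coding bias=False, and fall back to False so existing checkpoints behave as before.

```python
import torch.nn as nn


class FlashLlamaAttention(nn.Module):  # illustrative name, not the real class
    def __init__(self, config):
        super().__init__()
        # Read the flag from the config; default False matches the old hard-coded behavior.
        bias = getattr(config, "attention_bias", False)
        self.q_proj = nn.Linear(config.hidden_size, config.hidden_size, bias=bias)
        self.k_proj = nn.Linear(config.hidden_size, config.hidden_size, bias=bias)
        self.v_proj = nn.Linear(config.hidden_size, config.hidden_size, bias=bias)
        self.o_proj = nn.Linear(config.hidden_size, config.hidden_size, bias=bias)


class FlashLlamaMLP(nn.Module):  # illustrative name, not the real class
    def __init__(self, config):
        super().__init__()
        # Same pattern for the MLP projections, driven by mlp_bias.
        bias = getattr(config, "mlp_bias", False)
        self.gate_proj = nn.Linear(config.hidden_size, config.intermediate_size, bias=bias)
        self.up_proj = nn.Linear(config.hidden_size, config.intermediate_size, bias=bias)
        self.down_proj = nn.Linear(config.intermediate_size, config.hidden_size, bias=bias)
```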
Result
Models that contain attention and MLP biases should now load properly; models without them are unaffected since both flags default to False.
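A rough way to check whether a checkpoint declares these biases before loading it; this assumes the Hugging Face LlamaConfig (which exposes attention_bias and, since the linked PR, mlp_bias) and uses a placeholder model path:

```python
from transformers import LlamaConfig

# Placeholder path; substitute an actual Calico checkpoint directory or hub id.
config = LlamaConfig.from_pretrained("path/to/calico-checkpoint")

# With this change, the Flash/Paged Llama implementations build their attention
# and MLP projections with biases when these flags are True, so such a
# checkpoint no longer fails to load due to unexpected bias weights.
print(config.attention_bias, getattr(config, "mlp_bias", False))
```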
Related Issues
NA