Open Moreh-LeeJunhyeok opened 10 months ago
What is the status of this issue? Is it in progress? Thanks in advance.
Hi @Moreh-LeeJunhyeok, thanks for opening an issue!
Note - making experiments easier isn't in and of itself enough of a reason to add something to a model. However, as there was the equivalent added to Llama, this seems reasonable. Could you open a PR and we can review your proposed changes?
Sorry to disturb. Is there any article about the difference in performance when training Llama with attention bias enabled versus disabled?
Feature request
System Info
transformers version: 4.36.2
Who can help?
I don't have a clue about this.
Information
Referring to the Llama 2 modeling code, I want to add an attention bias option to the Mixtral model and its configuration, for flexibility in experiments.
If these changes seem appropriate, I will open a PR for them.
Expected behavior
After the change, an attention bias option for the model is added to the config.
It can be controlled as in the example below (the default config value is false).
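A minimal sketch of what I have in mind, assuming a new `attention_bias` field mirroring the one in `LlamaConfig` (the field name and the plain-Python stand-in class here are assumptions, not the actual transformers API):

```python
# Hypothetical sketch of the proposed `attention_bias` config flag for
# Mixtral, mirroring LlamaConfig. Plain dataclass used as a stand-in so
# the sketch is self-contained; the real change would live in
# MixtralConfig and modeling_mixtral.py.
from dataclasses import dataclass


@dataclass
class MixtralConfigSketch:
    hidden_size: int = 4096
    num_attention_heads: int = 32
    attention_bias: bool = False  # proposed new field, default False


def qkv_bias_flags(config: MixtralConfigSketch) -> dict:
    # In the real modeling code, each projection would be created as
    # nn.Linear(..., bias=config.attention_bias), as Llama does.
    return {
        name: config.attention_bias
        for name in ("q_proj", "k_proj", "v_proj", "o_proj")
    }


default_config = MixtralConfigSketch()  # bias disabled by default
print(qkv_bias_flags(default_config))

biased_config = MixtralConfigSketch(attention_bias=True)
print(qkv_bias_flags(biased_config))
```

With this, existing checkpoints keep their current behavior (`attention_bias=False`), and experiments can opt in by setting the flag to true.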
Motivation
Referring to the Llama 2 modeling code, I want to add an attention bias option to the Mixtral model and its configuration, for flexibility in experiments.
Your contribution
I have created a fix branch; I can open a PR from it (see link).