I had to change how the attn implementation is initialized. Unexpectedly (at least for me :sweat_smile:), it turns out that the attention implementation cannot be specified in the model's config.json file. It can only be set as an argument when the config object is created (`attn_implementation` is read from `kwargs`, not from `config_dict`, see: https://github.com/huggingface/transformers/blob/v4.36.1/src/transformers/configuration_utils.py#L772). So I added an `attn_implementation` argument to the parser and pass it to the config object when it is created.
I hope this change is OK (a new argument to the parser plus a modified `AutoConfig.from_pretrained` call).
Issue: https://github.com/OpenGVLab/OmniQuant/issues/46