axolotl-ai-cloud / axolotl

Go ahead and axolotl questions
https://axolotl-ai-cloud.github.io/axolotl/
Apache License 2.0

Migrate multipack to refactored flash attention #1774

Open casper-hansen opened 1 month ago

casper-hansen commented 1 month ago

āš ļø Please check that this feature request hasn't been suggested before.

🔖 Feature description

Transformers recently refactored flash attention. At present, axolotl patches `_get_unpad_data` on a per-model basis for its multipack feature. Instead, it should replace the single, centralized `_get_unpad_data` introduced by the refactor in transformers: https://github.com/huggingface/transformers/pull/31446
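A minimal sketch of why the refactor makes this possible, using stdlib stand-ins rather than real transformers modules (the module and function names below mirror the real ones, but the bodies are placeholders): model code now resolves the helper through `modeling_flash_attention_utils` at call time, so one reassignment reaches every architecture.

```python
import types

# Stand-in for transformers.modeling_flash_attention_utils (illustrative only)
flash_utils = types.ModuleType("modeling_flash_attention_utils")
flash_utils._get_unpad_data = lambda attention_mask: "upstream"

# Stand-in for a model file: it looks the helper up via the module
# attribute at call time instead of importing it into its own namespace.
def model_forward(attention_mask):
    return flash_utils._get_unpad_data(attention_mask)

# axolotl-style patch: swap the central symbol once, covering all models.
flash_utils._get_unpad_data = lambda attention_mask: "multipack-aware"

print(model_forward(None))  # -> multipack-aware
```

Before the refactor, each model file bound its own copy of the helper, which is why axolotl needed a long per-architecture patch list.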

āœ”ļø Solution

Replace the long if-statement in monkeypatch/multipack with a direct replacement of the centralized unpad function:

        transformers.modeling_flash_attention_utils._get_unpad_data = (  # pylint: disable=protected-access
            get_unpad_data
        )
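For context, a pure-Python sketch of the kind of metadata a multipack-aware `get_unpad_data` replacement has to produce for flash attention's varlen kernels: per-sequence lengths and cumulative boundaries (`cu_seqlens`). The real implementation operates on torch tensors and handles padding; the function name and the mask convention below (a distinct id per packed sample) are illustrative assumptions.

```python
import itertools

def unpad_metadata(attention_mask):
    """attention_mask carries a distinct id per packed sample, e.g. [1, 1, 1, 2, 2].

    Returns (seqlens, cu_seqlens, max_seqlen) as plain Python values.
    """
    # run-length encode the mask: each run of equal ids is one packed sequence
    seqlens = [len(list(group)) for _, group in itertools.groupby(attention_mask)]
    # cumulative boundaries, starting at 0, as flash attention varlen expects
    cu_seqlens = [0] + list(itertools.accumulate(seqlens))
    max_seqlen = max(seqlens)
    return seqlens, cu_seqlens, max_seqlen

# two sequences of length 3 and 2 packed into one row
print(unpad_metadata([1, 1, 1, 2, 2]))  # -> ([3, 2], [0, 3, 5], 3)
```

Patching the single upstream `_get_unpad_data` means this multipack-aware logic is picked up by every attention implementation that routes through the shared helper.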

ā“ Alternatives

No response

šŸ“ Additional Context

No response

Acknowledgements

winglian commented 1 month ago

This should be fixed with #1773; I just need to do some manual testing.