princeton-nlp / LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://arxiv.org/abs/2310.06694
MIT License

AttributeError: module 'flash_attn.flash_attn_interface' has no attribute 'flash_attn_unpadded_func' #33

Closed: YanxiZSQ closed this issue 9 months ago

YanxiZSQ commented 9 months ago

Running llmshearing/scripts/pruning.sh on an A100 fails with the error in the title.

YanxiZSQ commented 9 months ago

I found that these functions were renamed when upgrading from FlashAttention 1.x to FlashAttention-2:

- flash_attn_unpadded_func -> flash_attn_varlen_func
- flash_attn_unpadded_qkvpacked_func -> flash_attn_varlen_qkvpacked_func
- flash_attn_unpadded_kvpacked_func -> flash_attn_varlen_kvpacked_func
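
As a possible workaround (untested, and assuming the FlashAttention-2 functions keep a call-compatible signature), the import could fall back between the two naming schemes instead of hard-coding the 1.x names:

```python
# Sketch of a version-agnostic import: prefer the FlashAttention-2 names and
# alias them to the old 1.x names the rest of the code expects.
try:
    # FlashAttention 2.x
    from flash_attn.flash_attn_interface import (
        flash_attn_varlen_func as flash_attn_unpadded_func,
        flash_attn_varlen_qkvpacked_func as flash_attn_unpadded_qkvpacked_func,
        flash_attn_varlen_kvpacked_func as flash_attn_unpadded_kvpacked_func,
    )
except ImportError:
    # FlashAttention 1.x
    from flash_attn.flash_attn_interface import (
        flash_attn_unpadded_func,
        flash_attn_unpadded_qkvpacked_func,
        flash_attn_unpadded_kvpacked_func,
    )
```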

composer_llama.py line 818 still uses the old flash_attn_interface.flash_attn_unpadded_func, which no longer exists in FlashAttention-2.
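
Alternatively, a hypothetical runtime alias (not part of the repo) can paper over this without editing composer_llama.py, as long as it runs before that module is imported:

```python
# Hypothetical shim: expose the FlashAttention-2 function under the 1.x name
# that composer_llama.py expects. Must run before composer_llama is imported.
import flash_attn.flash_attn_interface as fai

if not hasattr(fai, "flash_attn_unpadded_func"):
    fai.flash_attn_unpadded_func = fai.flash_attn_varlen_func
```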