[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
Add flash attention support in training to save memory and speed up #35
Closed — lighten001 closed this issue 11 months ago
The patch lives in `src/model/flash_attn_patch.py` and overrides `LlamaAttention.forward` and `LlamaModel._prepare_decoder_attention_mask`. Note that `use_cache` should be `False` during training, since the flash-attention path does not support the key/value cache.
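Such patches typically work by rebinding methods on the `transformers` classes at import time (monkey-patching). A minimal sketch of that pattern, using a hypothetical stand-in `Attention` class so it runs without `transformers`, `flash_attn`, or a GPU:

```python
# Minimal sketch of the monkey-patching pattern used by flash-attention
# patches for LLaMA. The real patch rebinds LlamaAttention.forward (and
# LlamaModel._prepare_decoder_attention_mask) inside `transformers`;
# `Attention` here is a hypothetical stand-in so the example is runnable.

class Attention:
    def forward(self, x):
        # Original (stand-in) attention path.
        return f"standard-attention({x})"

def flash_attn_forward(self, x):
    # Replacement forward; a real patch would call flash-attention
    # kernels here instead of the standard attention implementation.
    return f"flash-attention({x})"

def apply_flash_attn_patch():
    # Rebind the method on the class itself, so every existing and
    # future instance picks up the new forward implementation.
    Attention.forward = flash_attn_forward

apply_flash_attn_patch()
out = Attention().forward("hidden_states")
```

Because the method is replaced on the class (not on an instance), the patch must be applied before the model is used for training, and any code path that assumes the original behavior (such as `use_cache=True`) must be disabled.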