What llama attn replacement to use for SFT-based inference?

dvlab-research / LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

http://arxiv.org/abs/2309.12307

Apache License 2.0


What llama attn replacement to use for SFT-based inference? #159

Open spring1915 opened 9 months ago

spring1915 commented 9 months ago

I fine-tuned with supervised-fine-tune.py. Does this mean that in inference.py I should use from llama_attn_replace_sft import replace_llama_attn instead of the currently specified from llama_attn_replace import replace_llama_attn?
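
To make the proposed change concrete, here is a minimal sketch of the swap. It only shows how the import in inference.py could be switched; whether the SFT variant is actually the right one for inference is the open question. The helper names `attn_module_name` and `load_replace_llama_attn` and the `trained_with_sft` flag are hypothetical and not part of the repository; the two module names and `replace_llama_attn` are taken from the question above.

```python
import importlib


def attn_module_name(trained_with_sft: bool) -> str:
    """Return the name of the attention-replacement module to import.

    Hypothetical helper: the module names come from the question above,
    the selection rule itself is the assumption being asked about.
    """
    return "llama_attn_replace_sft" if trained_with_sft else "llama_attn_replace"


def load_replace_llama_attn(trained_with_sft: bool):
    """Import the chosen module and return its replace_llama_attn function.

    Must be run from the LongLoRA repository root so that the modules
    are importable; raises ImportError otherwise.
    """
    module = importlib.import_module(attn_module_name(trained_with_sft))
    return module.replace_llama_attn


# Usage inside inference.py (assumption: the model was trained with
# supervised-fine-tune.py):
#   replace_llama_attn = load_replace_llama_attn(trained_with_sft=True)
# and then call replace_llama_attn exactly as inference.py already does.
```

With this in place, the only difference between the two setups is the flag value, so it is easy to compare outputs from both variants on the same prompt.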