thunlp / InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
MIT License
269 stars 21 forks

Does InfLLM support a quantized version of Qwen1.5-7B? #32

Closed huliangbing closed 3 months ago

huliangbing commented 4 months ago

Great work! Does InfLLM support a quantized version of Qwen1.5-7B?

ChuanhongLi commented 4 months ago

See https://github.com/thunlp/InfLLM/issues/16. I previously verified Qwen1.5-72B-Chat-GPTQ-Int4; other quantized models should work similarly.

huliangbing commented 4 months ago

Thanks! How are the results? How is the speed?

ChuanhongLi commented 4 months ago

> Thanks! How are the results? How is the speed?

The model is large, so it runs rather slowly. I did a quick test on two LongBench datasets: Evaluating on: ['narrativeqa.jsonl', 'qasper.jsonl', 'result.json'] {'narrativeqa': 23.66, 'qasper': 40.86}

ehuaa commented 4 months ago

> Thanks! How are the results? How is the speed?
>
> The model is large, so it runs rather slowly. I did a quick test on two LongBench datasets: Evaluating on: ['narrativeqa.jsonl', 'qasper.jsonl', 'result.json'] {'narrativeqa': 23.66, 'qasper': 40.86}

Could I ask how the calibration dataset for quantization was selected? Was it sampled from LongBench? @ChuanhongLi

huliangbing commented 4 months ago

@ChuanhongLi Could you advise which file to modify, and how?

ChuanhongLi commented 4 months ago

> Was it sampled from LongBench? @ChuanhongLi

We used an openly released quantized model directly; we did not quantize the model ourselves.

ChuanhongLi commented 4 months ago

> Which file should I modify, and how?

Adjust n_local, topk, max_cached_block, chunk_size, and so on in config/qwen-inf-llm.yaml; see #11 for reference.

huliangbing commented 4 months ago

@ChuanhongLi Thank you very much!

ehuaa commented 4 months ago

> Adjust n_local, topk, max_cached_block, chunk_size, and so on; see #11 for reference.
>
> config/qwen-inf-llm.yaml

@ChuanhongLi Hi, with both the Qwen1.5-72B-Chat-AWQ and GPTQ versions I run out of GPU memory on an A100 80G, even with the repo's stock qwen config. Which config parameters should I change, and to what values? Could you paste the config that worked for you? Thanks.

ChuanhongLi commented 4 months ago

> Could you paste the config that worked for you? Thanks.

```yaml
block_size: 128
n_init: 128
n_local: 2048
topk: 4
repr_topk: 4
max_cached_block: 4
exc_block_size: 512
score_decay: 0.1
fattn: true
base: 1000000
distance_scale: 1.0
max_len: 2147483647
chunk_size: 1024
conv_type: qwen
```
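As a rough back-of-the-envelope illustration (my own sketch, not code from the InfLLM repo), the parameters above bound the GPU-resident KV cache per attention layer at roughly: initial (sink) tokens + local window + memory blocks kept in the GPU cache. Lowering n_local and max_cached_block is therefore the most direct way to reduce memory when a quantized 72B model OOMs:

```python
def approx_gpu_kv_tokens(n_init: int, n_local: int,
                         max_cached_block: int, block_size: int) -> int:
    """Rough upper bound on GPU-resident KV-cache tokens per attention
    layer under InfLLM-style settings: initial tokens + local window +
    memory blocks cached on the GPU. This is a simplification that
    ignores block eviction/offloading details."""
    return n_init + n_local + max_cached_block * block_size

# With the config above: 128 + 2048 + 4 * 128 = 2688 tokens per layer,
# far fewer than a full-attention cache would hold on long inputs.
print(approx_gpu_kv_tokens(128, 2048, 4, 128))
```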