Open WanBenLe opened 6 months ago
I have fixed the llava-v1.5 issue with the latest transformers version and added support for llava-v1.6 (LLavaNext). Can I create a PR for these changes? The fix is in /root/autodl-tmp/wsl/AutoAWQ-main/awq/modules/fused/model.py and other files:
if input_ids is None and kwargs['past_key_values'] is None:
    # Multimodal path: LLaVA has already merged the image features into inputs_embeds,
    # so batch size and sequence length come from position_ids instead of input_ids.
    input_ids, self.last_forward_num_tokens = fused_utils.prepare_input_ids(
        kwargs['position_ids'], self.last_forward_num_tokens
    )
    _bsz, seqlen = kwargs['position_ids'].shape
    h = kwargs['inputs_embeds']
    device = h.device
else:
    # Text-only path: embed the token ids as before.
    input_ids, self.last_forward_num_tokens = fused_utils.prepare_input_ids(
        input_ids, self.last_forward_num_tokens
    )
    _bsz, seqlen = input_ids.shape
    device = input_ids.device
    h = self.embedding(input_ids)

fused_utils.prepare_cache(self.blocks, seqlen)

mask = fused_utils.prepare_attention_mask(
    seqlen=seqlen,
    start_pos=self.blocks[0].attn.start_pos,
    device=device,
    type_as=h,
)
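For readers skimming the thread, here is a minimal, self-contained sketch of the dispatch the patch above performs (the helper name prepare_hidden_states is illustrative, not part of AutoAWQ): when LLaVA has already merged image features into inputs_embeds, input_ids is None and the sequence shape has to come from position_ids; otherwise the token ids are embedded as usual.

```python
import torch

def prepare_hidden_states(input_ids, inputs_embeds, position_ids, embedding):
    # Illustrative helper, not AutoAWQ code: pick the hidden-state source.
    if input_ids is None:
        # Multimodal prefill: embeddings already contain the merged image tokens.
        _bsz, seqlen = position_ids.shape
        h = inputs_embeds
    else:
        # Text-only prefill/decode: embed the token ids.
        _bsz, seqlen = input_ids.shape
        h = embedding(input_ids)
    return h, seqlen, h.device

# Text-only example
emb = torch.nn.Embedding(1000, 16)
ids = torch.randint(0, 1000, (1, 8))
h, seqlen, device = prepare_hidden_states(ids, None, None, emb)

# Multimodal example: embeddings are passed in directly, shapes come from position_ids
pos = torch.arange(12).unsqueeze(0)
h, seqlen, device = prepare_hidden_states(None, torch.randn(1, 12, 16), pos, emb)
```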
Hi @WanBenLe, please create a PR with a description of the issue and how this solves your problem
Hi @WanBenLe, would you mind sharing the exact quantization script used to produce llava-v1.6-34b-hf-awq? @casper-hansen I'm also wondering if the PR will be merged soon.
For the unmerged version (AutoAWQ==0.2.5), the code and an example for llava-next support are here: https://github.com/WanBenLe/AutoAWQ-with-llava-v1.6/blob/main/examples/llavanext.py
If you plan to use multimodal data (AutoAWQ==0.2.4): https://github.com/WanBenLe/AutoAWQ-with-quantizer/blob/main/examples/multimodal_inputs_prepare.py and https://github.com/WanBenLe/AutoAWQ-with-quantizer/tree/main/examples/multimodal_quant_test.py
Using the default calibration data setting may raise a NaN loss error (around AutoAWQ/tree/main/awq/quantize.py line 343).
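If you run into that NaN, one workaround is to pass your own calibration texts instead of the default dataset. A minimal sketch, assuming the calib_data keyword of model.quantize is available in this AutoAWQ version; the model paths and prompts below are placeholders:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "llava-1.5-7b-hf"        # placeholder model path
quant_path = "llava-1.5-7b-hf-awq"    # placeholder output path
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Custom calibration prompts that resemble the target workload; longer,
# in-domain texts tend to keep the activation statistics well behaved.
calib_texts = [
    "A photo of a cat sitting on a wooden table next to a cup of coffee.",
    "Describe the main objects in the image and their spatial relationships.",
]
model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_texts)
model.save_quantized(quant_path)
```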
Hi, does autoawq 0.2.5 support llava 1.5? Could you share some example code, and what is the minimum required transformers version?
Why don't you just try the official example together with my code? It should run. Here are two links: the original llava-v1.5 PR https://github.com/casper-hansen/AutoAWQ/pull/250 and the new PR waiting to be merged, https://github.com/casper-hansen/AutoAWQ/pull/471
OK, thanks a lot, I will give it a try.
@WanBenLe Hi, I ran into the following problem while trying AutoAWQ-with-llava-v1.6. Do you know how to solve it?
Traceback (most recent call last):
File "/home/common/singlefeng/AIGC_TRAIN/AutoAWQ-with-llava-v1.6_20240624/quantize_llava.py", line 22, in
My quantization code is as follows:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'llava-1.5-7b-hf'
quant_path = 'llava-1.5-7b-hf-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

print(f'Model is quantized and saved at "{quant_path}"')
@1SingleFeng You can check whether your model or your data was not moved to CUDA with .to('cuda') or something similar; if you don't plan to use CUDA, you can set the OS CUDA device environment variable to empty.
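A minimal sketch of the two options above (note that CUDA_VISIBLE_DEVICES has to be set before torch initializes CUDA):

```python
import os

# Option 1: run without CUDA by hiding every GPU (set this before importing torch/awq).
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Option 2: keep CUDA and move both the model and the inputs to the GPU, e.g.
#   model = model.to("cuda")
#   inputs = {k: v.to("cuda") for k, v in inputs.items()}
```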
I tried to run llava-v1.6-34b-hf-awq and it succeeded, but how can I run the test for llava-v1.5 ConditionalGeneration? https://github.com/casper-hansen/AutoAWQ/pull/250 The bug in the example is likely: