zkkli / I-ViT

[ICCV 2023] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Apache License 2.0

Issues on evaluating latency using TVM. #3

Closed rkdgmlqja closed 1 month ago

rkdgmlqja commented 7 months ago

Hi, I'm currently working on compiling I-ViT using TVM. In this project, the following error appears:

```
Check failed: value < 1LL << (dtype.bits() - 1) (192 vs. 128) : ValueError: Literal value 192 exceeds maximum of int8
```
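For reference, the same bound check can be triggered directly; a minimal sketch, assuming a recent TVM build where integer literals are validated when a `tir.IntImm` node is constructed:

```python
import tvm

# Any integer literal carried as int8 must fit in [-128, 127], so 192 fails
# the same check reported in the traceback above.
tvm.tir.IntImm('int8', 192)
# -> ValueError: Literal value 192 exceeds maximum of int8
```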

Changing the value 192 to something lower than 128 in build_model.py seems to solve the issue:

```python
if name == 'deit_tiny_patch16_224':
    # embed_dim = 192
    embed_dim = 92  # lowered below the int8 bound as a workaround
    num_heads = 3
```

But strictly speaking, this method arbitrarily modifies the model's structure, so it is not an appropriate solution. Would changing TVM's version solve this issue?

Thanks, as always.

gihwan-kim commented 6 months ago

@zkkli @rkdgmlqja I have the same issue. If embed_dim is 128 or higher (outside the int8 range), the ValueError occurs. But in the build_model.py file, every embed_dim is higher than 128, so the error is inevitable. How can this problem be solved?

rkdgmlqja commented 1 month ago

The error was caused by quantized_layernorm in layers.py:

```python
from tvm import relay

def quantized_layernorm(data, bias_int):
    # Casting to int32 before the mean ensures the reduction divisor
    # (embed_dim, e.g. 192) is created as an int32 constant rather than
    # an out-of-range int8 literal.
    data = relay.cast(data, 'int32')
    mean = relay.mean(data, axis=2, keepdims=True)
    data = data - mean
    data_sq = data * data
    # ... (rest of the function unchanged)
```

This should fix it
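For anyone hitting the same error, here is a minimal sketch of why the cast helps, assuming (as the traceback suggests) that relay.mean materializes the reduction divisor in the input dtype; the shape and variable name below are illustrative, not from the repo:

```python
from tvm import relay

# Illustrative shape: (batch, tokens, embed_dim) with embed_dim = 192.
x = relay.var('x', shape=(1, 197, 192), dtype='int8')

# Reducing directly over the int8 tensor would create the divisor 192 as
# an int8 literal during lowering, reproducing the error above. Casting to
# int32 first keeps the divisor within its dtype's range:
mean = relay.mean(relay.cast(x, 'int32'), axis=2, keepdims=True)
```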