weihai-98 / A-2Q


Support for heterogeneous graphs #3

Open Waleed794 opened 2 weeks ago

Waleed794 commented 2 weeks ago

Hello, thank you for sharing the code for A-2Q!

I'm interested in applying A-2Q to heterogeneous graphs, specifically using RGCNs. I've replaced the standard Linear layers with your provided QLinear layers, modified the forward function accordingly, and also applied PyTorch's dynamic quantization after training.
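For context, the core idea behind the post-training dynamic quantization step can be sketched in pure Python. This is only an illustration of symmetric per-tensor int8 quantization, not A-2Q's code or PyTorch's actual implementation; the function names here are made up.

```python
# Toy sketch of the idea behind post-training int8 quantization of a
# weight tensor: pick a scale from the max magnitude, round weights to
# int8, and dequantize at compute time. Illustrative names, not A-2Q's API.

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w_q = round(w / scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate float weights recovered from int8 codes."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# per-element reconstruction error is bounded by scale / 2
err = max(abs(a - b) for a, b in zip(w, w_hat))
assert err <= s / 2 + 1e-12
```

PyTorch's dynamic quantization applies this kind of transform per Linear layer at load time, quantizing weights once and computing activations' scales on the fly.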

While testing on the OGB-MAG dataset, accuracy seems to be preserved. However, I've observed an increase in both inference time and memory usage. Is this expected behavior for heterogeneous graphs with A-2Q? Any insights or suggestions would be greatly appreciated.

If this approach is a viable option for heterogeneous graphs, I'd be happy to contribute code to support RGCNs in your repository.

weihai-98 commented 2 weeks ago

Thanks for your interest in our work! Because GPUs do not natively support computation with mixed-precision quantized operands, running such models leads to higher memory overhead and inference latency. Consider writing a custom CUDA kernel for mixed-precision quantized models to solve this problem.
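To make the overhead concrete: sub-byte weights (say 2-bit) must be packed for storage and unpacked again before a standard matmul can consume them, and without a fused custom kernel that unpack pass runs on every inference call. A toy pure-Python sketch of the pack/unpack round trip (illustrative only, not A-2Q's code):

```python
# Toy illustration of mixed-precision storage overhead: four 2-bit codes
# fit in one byte (4x smaller), but each inference must first unpack them
# back to full integers before standard kernels can use them.

def pack_2bit(values):
    """Pack groups of four 2-bit values (0..3) into single bytes."""
    assert all(0 <= v <= 3 for v in values) and len(values) % 4 == 0
    out = []
    for i in range(0, len(values), 4):
        b = 0
        for j, v in enumerate(values[i:i + 4]):
            b |= v << (2 * j)  # each value occupies a 2-bit slot
        out.append(b)
    return bytes(out)

def unpack_2bit(packed, n):
    """Recover n 2-bit values; this extra pass is the per-call runtime cost."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]

vals = [3, 0, 1, 2, 2, 2, 0, 1]
packed = pack_2bit(vals)          # 2 bytes instead of 8 ints
assert unpack_2bit(packed, len(vals)) == vals
```

A custom kernel avoids this by unpacking and multiplying in one fused pass, so the low-bit representation never has to be materialized at full width in memory.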

Waleed794 commented 2 weeks ago

Thanks for the prompt reply. I am currently running training and inference on the CPU, not the GPU.