Add support for GPUs with compute capability lower than 8.0 for awq/kernels installation

I tried to install and run the project on a machine with an NVIDIA Tesla T4 GPU, which has a compute capability of 7.5 (SM 75).

Environment: Ubuntu 22.04 with CUDA 12.1

I followed the installation steps at https://github.com/mit-han-lab/llm-awq/tree/main?tab=readme-ov-file#install and hit an error during the third step:

cd awq/kernels
python setup.py install

The following error was reported:

ptxas /tmp/tmpxft_0000f5ba_00000000-6_gemm_cuda_gen.ptx, line 709; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0000f5ba_00000000-6_gemm_cuda_gen.ptx, line 713; error   : Feature '.m16n8k16' requires .target sm_80 or higher
ptxas /tmp/tmpxft_0000f5ba_00000000-6_gemm_cuda_gen.ptx, line 717; error   : Feature '.m16n8k16' requires .target sm_80 or higher
...
ptxas fatal   : Ptx assembly aborted due to errors
error: command '/usr/local/cuda-12.1/bin/nvcc' failed with exit code 255

Root cause: the kernel uses the '.m16n8k16' feature, which requires .target sm_80 or higher, while the T4 is sm_75.
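
For anyone who wants to confirm this before starting the build, here is a minimal sketch of a pre-check. It is not part of the repository: the names meets_requirement and local_gpu_capability are hypothetical, and the (8, 0) threshold is taken only from the ptxas message above. It assumes PyTorch is installed and only reports whether the GPU meets the requirement; it does not make the kernels build on a T4.

```python
# Threshold taken from the ptxas message: "requires .target sm_80 or higher".
REQUIRED_CAPABILITY = (8, 0)


def meets_requirement(capability, required=REQUIRED_CAPABILITY):
    """Return True if a (major, minor) compute capability is at least `required`."""
    return tuple(capability) >= tuple(required)


def local_gpu_capability(device_index=0):
    """Return (major, minor) for the local GPU, or None if PyTorch/CUDA is unavailable."""
    try:
        import torch  # assumption: PyTorch is installed in the build environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    cap = local_gpu_capability()
    if cap is None:
        print("No CUDA device visible to PyTorch; cannot check.")
    elif meets_requirement(cap):
        print(f"sm_{cap[0]}{cap[1]}: meets the sm_80 requirement")
    else:
        print(f"sm_{cap[0]}{cap[1]}: below sm_80, the ptxas error above is expected")
```

On the T4 described here, the capability is (7, 5), so meets_requirement returns False.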

Is there a configuration flag or workaround to support GPUs with compute capability below 8.0?
