Short-term. We need to monkeypatch Transformers so that the AutoModelForCausalLM.from_pretrained() hook into AutoGPTQ is routed to GPTQModel instead.
For the monkeypatch there are two paths:
1. Directly monkeypatch the Transformers code.
2. Monkeypatch the AutoGPTQ.from_quantized() class method so it is routed to GPTQModel.from_quantized() instead when Transformers makes the hook call.
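Path 2 can be sketched roughly as below. The classes here are stand-ins for the real auto_gptq and gptqmodel packages (an assumption for illustration, not the actual package internals); the point is only the patching pattern: rebind the class method Transformers already calls so it transparently lands in GPTQModel.

```python
# Stand-in for auto_gptq.AutoGPTQForCausalLM (illustrative, not the real class).
class AutoGPTQForCausalLM:
    @classmethod
    def from_quantized(cls, model_path, **kwargs):
        return f"auto_gptq loaded {model_path}"

# Stand-in for gptqmodel.GPTQModel (illustrative, not the real class).
class GPTQModel:
    @classmethod
    def from_quantized(cls, model_path, **kwargs):
        return f"gptqmodel loaded {model_path}"

# The monkeypatch: rebind AutoGPTQ's class method so Transformers'
# existing hook call is routed to GPTQModel without touching
# Transformers itself.
AutoGPTQForCausalLM.from_quantized = classmethod(
    lambda cls, model_path, **kwargs: GPTQModel.from_quantized(model_path, **kwargs)
)

# Transformers would still call AutoGPTQForCausalLM.from_quantized(),
# but the load is now serviced by GPTQModel.
print(AutoGPTQForCausalLM.from_quantized("some/model"))
```

The advantage of path 2 is that no Transformers internals are touched; the patch only needs to be applied before the first from_pretrained() call.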
Mid-term. We should also submit a PR to Transformers so the quant (AutoGPTQ) integration becomes a dynamic hook, not statically bound to any package. For this to happen, we need to design a shared generic api/hook structure so that GPTQModel and AutoGPTQ can co-exist, in addition to any future quant packages that want to hook into the loader/inference.
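One possible shape for that shared hook structure is a simple named registry that Transformers could dispatch through. Everything below is a hypothetical design sketch; none of these names exist in Transformers today.

```python
# Hypothetical registry-based loader hook (design sketch only; these
# names are assumptions, not an existing Transformers API).
_QUANT_LOADERS = {}

def register_quant_loader(name, loader):
    """Register a quant backend (e.g. 'auto_gptq', 'gptqmodel') by name."""
    _QUANT_LOADERS[name] = loader

def load_quantized(name, model_path, **kwargs):
    """Dispatch a quantized-model load to the selected backend."""
    if name not in _QUANT_LOADERS:
        raise KeyError(f"no quant backend registered under {name!r}")
    return _QUANT_LOADERS[name](model_path, **kwargs)

# Both packages co-exist by registering under distinct names; the model
# config (or user) selects which backend services the load.
register_quant_loader("gptqmodel", lambda p, **kw: f"gptqmodel: {p}")
register_quant_loader("auto_gptq", lambda p, **kw: f"auto_gptq: {p}")
print(load_quantized("gptqmodel", "some/model"))
```

A registry like this keeps Transformers agnostic: future quant packages hook in by registering a loader, with no per-package code paths inside Transformers.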
Target: v0.9.2