ModelCloud / GPTQModel

Apache License 2.0

[FEATURE] Intel/Habana HPU Support #94

Open Qubitium opened 1 week ago

Qubitium commented 1 week ago

Integrate Intel/Habana HPU kernel support for uint4 inference. This was merged into AutoGPTQ (https://github.com/AutoGPTQ/AutoGPTQ/pull/689/files), but there are no CI tests, and we have no access to Habana HPU hardware to test it.

For this to work, we need:

  1. Consistent access to Habana HPU hardware for CI and regression testing on all future releases. We will not merge any feature we cannot regression test against commits.
  2. A CI test that verifies the code works on Habana HPU, with valid PPL (perplexity) before and after quantization.
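
To make the PPL requirement concrete, here is a rough sketch of the kind of check such a CI test could assert. It is hardware-agnostic and only illustrates the comparison logic. The function names and the 10% tolerance are hypothetical placeholders, not part of GPTQModel or AutoGPTQ, and the per-token negative log-likelihoods would have to come from real pre- and post-quantization runs on the HPU.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_nlls:
        raise ValueError("need at least one token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))


def ppl_regression_ok(ppl_base, ppl_quant, max_rel_increase=0.10):
    """Return True if the quantized PPL is finite and within the allowed
    relative increase over the baseline (hypothetical 10% default)."""
    if not (math.isfinite(ppl_base) and math.isfinite(ppl_quant)):
        return False
    return ppl_quant <= ppl_base * (1.0 + max_rel_increase)
```

A CI job would compute `perplexity(...)` once for the original model and once for the quantized model on the same evaluation text, then fail the build when `ppl_regression_ok(...)` returns `False`. The right tolerance would need to be chosen per model and bit width.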

@HolyFalafel As the creator of the AutoGPTQ PR for Habana HPU integration, do you know how or where to get developer access to Habana HPU for testing? Do you have contacts at Intel who could provide us with credits for future regression testing? Without consistent access to Habana HPU, we will not attempt to merge this feature, since we only merge what we can vouch for and regression test against all future changes. Thanks!

HolyFalafel commented 1 week ago

@Qubitium, I've forwarded this to the relevant people at Habana. They'll reply soon.