anarchy-ai / LLM-VM

irresponsible innovation. Try now at https://chat.dev/
https://anarchy.ai/
MIT License

Implement 4bit, 8bit quantization for Nvidia GPUs #224

Open VictorOdede opened 10 months ago

VictorOdede commented 10 months ago

Can be done with GPTQ
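For context, the core operation behind int8/int4 weight quantization is mapping float weights to a small integer range plus a per-tensor (or per-block) scale. A minimal, illustrative sketch of round-to-nearest absmax int8 quantization is below; GPTQ itself goes further and uses second-order (Hessian) information to choose roundings that minimize layer output error, so this is only the basic building block, not the GPTQ algorithm:

```python
def quantize_absmax_int8(weights):
    """Map a list of floats to int8 values in [-127, 127] plus a scale."""
    absmax = max(abs(w) for w in weights) or 1.0
    scale = absmax / 127.0
    # Round-to-nearest; each value lands within scale/2 of the original.
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 1.0, -0.97]
q, scale = quantize_absmax_int8(weights)
recovered = dequantize_int8(q, scale)
# Quantization error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, recovered))
```

The same idea extends to 4-bit by shrinking the integer range (e.g. [-7, 7]) at the cost of a larger quantization step.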

mmirman commented 10 months ago

@VictorOdede What sort of time commitment do you think this is?

mmirman commented 10 months ago

If this isn't done with a library it's a $200 ticket; if so, it's a SWAG ticket.

VictorOdede commented 10 months ago

> If this isn't done with a library it's a $200 ticket; if so, it's a SWAG ticket.

This can be done using the bitsandbytes library.
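For reference, here is a sketch of what this could look like through the transformers integration of bitsandbytes (not the final PR code; the model id is a placeholder, and it assumes a CUDA GPU with the transformers, accelerate, and bitsandbytes packages installed):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization config; for 8-bit, pass load_in_8bit=True instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # normal-float 4-bit data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls
    bnb_4bit_use_double_quant=True,         # also quantize the quant constants
)

# "facebook/opt-350m" is just a placeholder model id for illustration.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place layers on the available GPU
)
```

The quantization happens at load time, so downstream generation code can stay unchanged.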

VictorOdede commented 10 months ago

> @VictorOdede What sort of time commitment do you think this is?

A few hours max.

bilal-aamer commented 9 months ago

@VictorOdede Is this issue resolved yet?

VictorOdede commented 9 months ago

Hey @bilal-aamer. This has already been implemented with bitsandbytes/GPTQ; just running some tests before merging the PR.