evshiron / rocm_lab

DEPRECATED!
https://are-we-gfx1100-yet.github.io

bitsandbytes 0.39.0? #8

Open ewof opened 1 year ago

ewof commented 1 year ago

Do you have plans for it, or is it not possible right now?

evshiron commented 1 year ago

The bitsandbytes here is built from https://github.com/agrocylo/bitsandbytes-rocm, which is currently at 0.37.2. The port is essentially a matter of running hipify and fixing up the output, but I personally have neither the knowledge nor the time to maintain an up-to-date version, so I have no plans for it at the moment.
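
For context, a minimal sketch of what the "hipify and fix" step looks like, assuming hipify-perl is on PATH and the CUDA kernels live under csrc/ (paths are illustrative, not the repo's actual build script):

```python
# Hypothetical sketch of the "hipify and fix" step: translate each CUDA
# source to HIP, then fix the output by hand. Paths are illustrative.
import pathlib
import subprocess

for cu in pathlib.Path("csrc").glob("*.cu"):
    hip = cu.with_suffix(".hip")
    with open(hip, "w") as out:
        # hipify-perl rewrites CUDA API calls to their HIP equivalents
        subprocess.run(["hipify-perl", str(cu)], stdout=out, check=True)
    # the generated .hip files usually still need manual edits afterwards
```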

shermdog commented 1 year ago

I dug into this a little. The hurdle may be porting the 4-bit update over to ROCm. It's also over my head, but it seems like ROCm can support it.

ewof commented 1 year ago

I have most of it ported, but the .hip and .hiph files generated by hipify-clang don't work; a lot of manual editing is needed. I think it's because I don't have hipBLASLt, which I have been trying to compile for a while (their install script doesn't support Arch-based distros).

shermdog commented 1 year ago

I've made some solid progress on porting 0.39.0 to ROCm. It currently compiles and will load models in 4-bit, but generation returns gibberish. There are still two major things to port: the WMMA matrix bits and fixing bfloat16.

https://github.com/TimDettmers/bitsandbytes/compare/main...shermdog:rocm_039?expand=1

evshiron commented 1 year ago

@shermdog

I cloned your repo and tried it locally. When running examples/int8_inference_huggingface.py with load_in_4bit=True, it always decoded to the same token, but load_in_8bit=True did work.
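
For anyone who wants to reproduce this, here is a minimal sketch of that kind of test. The model id is an assumption; substitute whatever examples/int8_inference_huggingface.py actually uses.

```python
# Minimal repro sketch for the behaviour described above. The model id
# is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-1.3b"
tok = AutoTokenizer.from_pretrained(model_id)

# load_in_4bit=True reproduced the repeated-token output on this port;
# load_in_8bit=True generated normally.
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", load_in_4bit=True
)

inputs = tok("Hello, my name is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```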

Nice work here, and I am looking forward to your future updates!

evshiron commented 11 months ago

Here is another fork:

It looks quite promising, but I haven't tested it.

ewof commented 10 months ago

I got load_in_4bit working with this and text-generation-webui by setting kQuantizeBlockwise back to how it is in the main repo.
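
A quick way to sanity-check that the blockwise 4-bit kernel works after a change like that, using the public bitsandbytes.functional API from 0.39.x (a sketch, assuming a working GPU build):

```python
# Hedged sanity check: blockwise 4-bit quantization should round-trip
# with small error once kQuantizeBlockwise is fixed. API names per
# upstream bitsandbytes 0.39.x.
import torch
import bitsandbytes.functional as F

x = torch.randn(1024, device="cuda", dtype=torch.float16)
q, state = F.quantize_4bit(x)       # blockwise 4-bit quantize
y = F.dequantize_4bit(q, state)     # inverse kernel
print((x - y).abs().mean())         # should be small, not garbage
```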