bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License

MPS progress #1326

Open tcdent opened 1 month ago

tcdent commented 1 month ago

Got the Metal -> pyobjc++ -> Python loading path working; quantize_mps no longer segfaults. Also the start of quantize_nf4 in the public API.

Builds with: $ cmake -DCOMPUTE_BACKEND=mps -S . && make

import ctypes
import torch
from bitsandbytes.functional import get_ptr, get_4bit_type

# load the freshly built MPS backend directly; importing `lib` from
# bitsandbytes would just be shadowed by this CDLL handle anyway
lib_path = "bitsandbytes/libbitsandbytes_mps.dylib"
lib = ctypes.CDLL(lib_path)

blocksize, n, quant_type = 64, 1024, "nf4"
A = torch.rand(n).float()
code = get_4bit_type(quant_type, device=A.device)  # 16-entry NF4 lookup table
absmax = torch.zeros(n // blocksize).float()       # one scale per block
out = torch.zeros((n + 1) // 2).byte()             # two 4-bit values per byte

lib.quantize_mps(
    get_ptr(code),
    get_ptr(A),
    get_ptr(absmax),
    get_ptr(out),
    ctypes.c_int32(blocksize),
    ctypes.c_int(n)
)
print(out[:10])

tensor([ 51, 51, 255, 255, 51, 51, 51, 51, 51, 51], dtype=torch.uint8)
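For sanity-checking the kernel's output, a pure-PyTorch reference for blockwise NF4 quantization can be sketched as below. This is not part of the PR: the code table is the NF4 quantile table from the QLoRA paper (the same values `get_4bit_type("nf4")` returns), and the nibble packing order (first element in the high nibble) is an assumption that may not match the MPS kernel.

```python
import torch

# NF4 quantile table from the QLoRA paper (assumed to match
# bitsandbytes' get_4bit_type("nf4"))
NF4_CODE = torch.tensor([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
])

def quantize_nf4_ref(A: torch.Tensor, blocksize: int = 64):
    """Reference blockwise NF4 quantization on CPU, for kernel comparison."""
    blocks = A.reshape(-1, blocksize)
    absmax = blocks.abs().max(dim=1).values        # one scale per block
    normed = blocks / absmax.unsqueeze(1)          # now in [-1, 1]
    # nearest NF4 code index for every element
    idx = (normed.unsqueeze(-1) - NF4_CODE).abs().argmin(dim=-1).to(torch.uint8)
    flat = idx.reshape(-1)
    # pack two 4-bit indices per byte; high-nibble-first order is an assumption
    packed = (flat[0::2] << 4) | flat[1::2]
    return packed, absmax

torch.manual_seed(0)
A = torch.rand(1024)
packed, absmax = quantize_nf4_ref(A)
print(packed.shape, absmax.shape)  # torch.Size([512]) torch.Size([16])
```

Dequantizing `packed` with the same table and `absmax` should recover `A` to within the quantization step, which gives a quick correctness check against whatever the MPS kernel writes into `out`.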

Titus-von-Koeller commented 1 month ago

Hey @tcdent, really cool, looks nice! Happy to see that you're taking the initiative 🤗

However, any specific reason why you merged main? We're keeping the branches separate for the time being, so the merge messes up the diff, and we also don't want everything from main in multi-backend-refactor; especially not mixed into a PR, since that makes it hard to review.

Happy to give this a look soon. Better to revert the merge from main though, if you can.

tcdent commented 1 month ago

Sorry, sloppy on my part. This work is a tangent from some tests I was running, which needed both branches to be up to date.

Is a revert OK? I can also open a fresh PR.