Open supersuperbang opened 7 months ago
System Info
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
Python 3.10.12
Reproduction
import bitsandbytes as bnb
from bitsandbytes import functional as F
import torch

bnb.nn.Linear8bitLt(1, 2, bias=True, has_fp16_weights=False, threshold=6.0)
A1 = torch.zeros(0, 0, 0, device='cpu')
A1 = torch.ones(1, 1, 1, device='cpu')
C, S = F.quantize_blockwise(A1, blocksize=2**64)
print(C)
crash backtrace
Thread 1 "python3" received signal SIGFPE, Arithmetic exception.
0x00007fffd7e035a6 in quantize_cpu(float*, float*, float*, unsigned char*, long long, long long) () from /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cpu.so
[ Legend: Modified register | Code | Heap | Stack | String ]
──── registers ────
$rax   : 0x1
$rbx   : 0x00007fffffffd9b0 → 0x0000003000000028 ("("?)
$rcx   : 0x0000555558055440 → 0x0000555557f24900 → "_\.]?[0-9]*)? # dev release\n [...]"
$rdx   : 0x0
$rsp   : 0x00007fffffffd740 → 0x0000000000000001
$rbp   : 0x00007fffffffd840 → 0x00007fffffffd880 → 0x00007fffffffd890 → 0x0000000000000006
$rsi   : 0x0000555557fba540 → 0x000000003f800000
$rdi   : 0x00005555580feb40 → 0xbf7a9999bf800000
$rip   : 0x00007fffd7e035a6 → <quantize_cpu(float*,+0> idiv QWORD PTR [rbp-0xf8]
$r8    : 0x0
$r9    : 0x1
$r10   : 0x0
$r11   : 0x00007fffd7e052c3 → <cquantize_blockwise_cpu_fp32+0> push rbp
$r12   : 0x8
$r13   : 0x00007fffffffd9f0 → 0x00007fffffffdab0 → 0x00007ffff7705570 → 0x0000000000000008
$r14   : 0x00007ffff7705570 → 0x0000000000000008
$r15   : 0x00007ffff7582110 → 0xffffb400ffffb3ac
$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x33 $ss: 0x2b $ds: 0x00 $es: 0x00 $fs: 0x00 $gs: 0x00
──── stack ────
0x00007fffffffd740│+0x0000: 0x0000000000000001 ← $rsp
0x00007fffffffd748│+0x0008: 0x0000000000000000
0x00007fffffffd750│+0x0010: 0x0000555558055440 → 0x0000555557f24900 → "_\.]?[0-9]*)? # dev release\n [...]"
0x00007fffffffd758│+0x0018: 0x0000555557faea80 → 0x0000555500000000
0x00007fffffffd760│+0x0020: 0x0000555557fba540 → 0x000000003f800000
0x00007fffffffd768│+0x0028: 0x00005555580feb40 → 0xbf7a9999bf800000
0x00007fffffffd770│+0x0030: 0x00007fffffffd950 → 0x0000555555c49060 → 0x0000000000000014
0x00007fffffffd778│+0x0038: 0x0000000000000001
──── code:x86:64 ────
  0x7fffd7e03599 <quantize_cpu(float*,+0> movss DWORD PTR [rax], xmm0
  0x7fffd7e0359d <quantize_cpu(float*,+0> mov rax, QWORD PTR [rbp-0x100]
  0x7fffd7e035a4 <quantize_cpu(float*,+0> cqo
→ 0x7fffd7e035a6 <quantize_cpu(float*,+0> idiv QWORD PTR [rbp-0xf8]
  0x7fffd7e035ad <quantize_cpu(float*,+0> mov QWORD PTR [rbp-0x40], rax
  0x7fffd7e035b1 <quantize_cpu(float*,+0> mov rax, QWORD PTR [rbp-0x100]
  0x7fffd7e035b8 <quantize_cpu(float*,+0> cqo
  0x7fffd7e035ba <quantize_cpu(float*,+0> idiv QWORD PTR [rbp-0xf8]
  0x7fffd7e035c1 <quantize_cpu(float*,+0> mov rax, rdx
[!] Command 'context' failed to execute properly, reason: 'threads'
gef➤ bt
#0 0x00007fffd7e035a6 in quantize_cpu(float*, float*, float*, unsigned char*, long long, long long) () from /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cpu.so
#1 0x00007fffd7e05309 in cquantize_blockwise_cpu_fp32 () from /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cpu.so
Expected behavior
execute normally without any crash
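I can reproduce this behavior. We get an integer division by zero because blocksize reaches quantize_cpu as a 64-bit long long: 2**64 does not fit in 64 bits and wraps to 0, and the idiv by that value raises the SIGFPE.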
Is there a practical reason or need for blocksize that large?
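To make the overflow concrete, here is a minimal sketch that does not call bitsandbytes at all. It assumes the Python wrapper converts blocksize with ctypes.c_longlong before calling the C function, which matches the long long parameters in the backtrace but is not confirmed in this thread. The helper names wrapped_blocksize and check_blocksize are hypothetical, and the guard is only one possible way to reject the value before it reaches native code.

```python
import ctypes

# Largest value a signed 64-bit C `long long` can hold.
LLONG_MAX = 2**63 - 1


def wrapped_blocksize(blocksize: int) -> int:
    """Return the value a C `long long` argument holds after ctypes conversion.

    ctypes integer types do no overflow checking, so out-of-range
    Python ints are silently truncated to 64 bits.
    """
    return ctypes.c_longlong(blocksize).value


def check_blocksize(blocksize: int) -> int:
    """Hypothetical guard: fail in Python instead of trapping in native code."""
    if not 0 < blocksize <= LLONG_MAX:
        raise ValueError(
            f"blocksize must be in [1, {LLONG_MAX}], got {blocksize}"
        )
    return blocksize


if __name__ == "__main__":
    # 2**64 truncates to 0, which is the divisor that triggers the SIGFPE.
    print(wrapped_blocksize(2**64))
    try:
        check_blocksize(2**64)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

With a check like this in front of the native call, the reproduction above would raise a ValueError rather than killing the interpreter.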