Open lcw99 opened 1 year ago
We've run into the exact same error before: https://github.com/hyperonym/basaran/issues/5. The error is caused by https://github.com/TimDettmers/bitsandbytes/issues/162 and appears to occur at random.
Currently the only workaround is to stop using INT8 quantization and use half-precision instead.
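For anyone loading the model through Hugging Face `transformers` directly, here is a rough sketch of what switching from INT8 to half-precision could look like. The helper names `model_load_kwargs` and `load_model` are made up for illustration and are not part of Basaran or bitsandbytes. The sketch assumes a `transformers` version whose `from_pretrained` still accepts `load_in_8bit=True` directly; newer releases may expect a `BitsAndBytesConfig` instead.

```python
from typing import Any, Dict


def model_load_kwargs(use_int8: bool) -> Dict[str, Any]:
    """Build keyword arguments for `from_pretrained` (hypothetical helper)."""
    if use_int8:
        # INT8 path through bitsandbytes: the configuration reported to fail.
        return {"load_in_8bit": True, "device_map": "auto"}
    # Workaround: skip INT8 and request half-precision weights instead.
    return {"torch_dtype": "float16"}


def load_model(name: str, use_int8: bool = False):
    """Load a causal LM. Heavy imports are deferred until a model is needed."""
    import torch
    from transformers import AutoModelForCausalLM

    kwargs = model_load_kwargs(use_int8)
    if "torch_dtype" in kwargs:
        # Resolve the dtype name to the actual torch dtype object.
        kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
    return AutoModelForCausalLM.from_pretrained(name, **kwargs)
```

Calling `load_model(name)` with the default `use_int8=False` takes the half-precision path. If you run Basaran itself, I believe the same switch is exposed through environment variables (`MODEL_LOAD_IN_8BIT` and `MODEL_HALF_PRECISION`), but check the README for the exact names. Keep in mind that half-precision needs roughly twice the GPU memory of INT8.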
When I call multiple streaming completions at the same time, I get the error below.