IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Does GPTQ reduce to OBQ if I set block size to 1? #25

Closed · zxxmxd closed this 1 year ago

efrantar commented 1 year ago

The blocksize has no effect on the output of the GPTQ algorithm; it merely affects the efficiency of execution on a GPU by batching updates together (though there may be some small numerical differences in practice). So no, setting the blocksize to 1 does not recover OBQ. In general, GPTQ relates to OBQ as follows: it uses the same update formulas, but applies quantization in the same fixed order across all matrix rows, whereas OBQ quantizes each row separately, in order of (dynamically determined) quantization difficulty, which generally differs between rows. This fixed order is what makes GPTQ dramatically more efficient than OBQ and allows it to scale to extremely large models.
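
In case it helps make this concrete, here is a rough NumPy sketch of the column loop from the paper. The `quantize` helper and all names here are illustrative, not this repo's API (the real code uses a proper per-group quantizer and runs on GPU); the point is that `blocksize` only controls *when* the trailing update is applied, never the quantization order:

```python
import numpy as np

def quantize(x, scale):
    # simple symmetric round-to-nearest quantizer (stand-in for the real one)
    return scale * np.round(x / scale)

def gptq_sketch(W, H, scale=0.1, blocksize=1):
    """Quantize the columns of W in a fixed left-to-right order shared by all
    rows, propagating the quantization error with the inverse Hessian.
    `blocksize` only batches the trailing update for efficiency; it does not
    change the order in which columns are quantized."""
    W = W.copy()
    d = W.shape[1]
    # upper-triangular Cholesky factor of H^-1, as in the paper
    Hinv = np.linalg.cholesky(np.linalg.inv(H)).T
    for i1 in range(0, d, blocksize):
        i2 = min(i1 + blocksize, d)
        Err = np.zeros((W.shape[0], i2 - i1))
        for j in range(i1, i2):
            q = quantize(W[:, j], scale)
            err = (W[:, j] - q) / Hinv[j, j]
            # immediately update the not-yet-quantized columns inside the block
            W[:, j + 1:i2] -= np.outer(err, Hinv[j, j + 1:i2])
            W[:, j] = q
            Err[:, j - i1] = err
        # apply the accumulated block update to all later columns at once
        W[:, i2:] -= Err @ Hinv[i1:i2, i2:]
    return W

# blocksize changes only how the updates are batched, not the result
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 256))       # d x n calibration inputs
H = X @ X.T + 1e-2 * np.eye(8)          # (damped) Hessian proxy
W = rng.standard_normal((4, 8))
assert np.allclose(gptq_sketch(W, H, blocksize=1),
                   gptq_sketch(W, H, blocksize=8))
```

With blocksize 1 the trailing update is applied after every column; with a larger blocksize the same updates are accumulated and applied in one matmul, which is what makes the GPU execution fast.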

zxxmxd commented 1 year ago

I see, thank you for your clarification.