[punet] Update quantizer to allow explicit mixed precision rescale.

Previously, the dtype of the scale dictated the output dtype after dequantization. That made it impossible to execute in low precision (e.g. fp16) while keeping the rescale computation in high precision (e.g. fp32). The high-precision rescale is needed to avoid overflow when the results of integer arithmetic are converted to float. (There are other ways to factor this that avoid higher-precision scales, but this approach is simple and standard.)
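To illustrate the overflow, here is a minimal sketch in plain Python. It is not the sharktank quantizer API. The helper names (`to_fp16`, `dequant_low_precision`, `dequant_mixed_precision`) and the accumulator and scale values are hypothetical. fp16 is emulated with the standard library `struct` module, and a Python float stands in for the fp32 rescale dtype.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest fp16 value; saturate to +/-inf on overflow."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range (max 65504).
        return float("inf") if x > 0 else float("-inf")


def dequant_low_precision(acc: int, scale: float) -> float:
    """Assumed old behavior: accumulator and scale are both cast to fp16
    before the rescale, so a large accumulator overflows immediately."""
    return to_fp16(to_fp16(float(acc)) * to_fp16(scale))


def dequant_mixed_precision(acc: int, scale: float) -> float:
    """Rescale in higher precision, then cast only the result to fp16."""
    return to_fp16(float(acc) * scale)


acc = 1_000_000  # hypothetical int32 accumulator from integer arithmetic
scale = 0.001    # hypothetical dequantization scale

low = dequant_low_precision(acc, scale)
mixed = dequant_mixed_precision(acc, scale)
print(low, mixed)
```

With these values the low-precision path overflows to infinity because the accumulator exceeds the fp16 range before it is scaled, while the mixed-precision path stays finite because only the already-rescaled result is cast down.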

Also enables quantized bias, since this change avoids the overflow, and fixes an unnecessarily large eps.

nod-ai/sharktank#51