If `torch_dtype` is set and is not `"auto"`, we set `bnb_4bit_compute_dtype` to it.
I have nothing beyond intuition to back this up, but I expect that aligning `compute_dtype` with whatever dtype the model was loaded in (assuming that matches the dtype it was trained in) should mitigate any dynamic range issues.
We could optionally add another argument to set the bnb compute dtype explicitly, but that seems excessive: I can't think of a case where it would be necessary, and it just adds another argument.
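The rule above can be sketched as a small helper (the function name is hypothetical and dtypes are shown as strings for illustration; in practice the resolved dtype would be passed to `BitsAndBytesConfig` as `bnb_4bit_compute_dtype`):

```python
def resolve_bnb_compute_dtype(torch_dtype, default="float32"):
    """Sketch of the proposed rule: if torch_dtype is set and is not
    "auto", reuse it as bnb_4bit_compute_dtype; otherwise fall back to
    the default (bitsandbytes uses float32 when nothing is specified)."""
    if torch_dtype is not None and torch_dtype != "auto":
        return torch_dtype
    return default

print(resolve_bnb_compute_dtype("bfloat16"))  # bfloat16
print(resolve_bnb_compute_dtype("auto"))      # float32
print(resolve_bnb_compute_dtype(None))        # float32
```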