Closed matthewdouglas closed 2 months ago
This is a companion PR for https://github.com/huggingface/transformers/pull/32276 to allow us to load prequantized weights with alternate storage. We keep track of the metadata we need the same way we would with `Params4bit.__new__` after PR #970.
This works with models exported with a non-default `quant_storage`, such as this one in NF4 with BF16 storage.
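For readers unfamiliar with alternate storage, here is a minimal, library-free sketch of the idea, not the bitsandbytes implementation. It assumes 4-bit codes packed two per byte, and treats the storage type as nothing more than a different view over the same packed bytes. `PackedWeight`, `pack_4bit`, and `unpack_4bit` are hypothetical names invented for this illustration; the point is only that the quantization metadata has to travel with the buffer, because the storage view alone does not say how to decode it.

```python
from dataclasses import dataclass


@dataclass
class PackedWeight:
    """Hypothetical container for illustration; NOT bitsandbytes' Params4bit."""
    storage: memoryview   # packed bytes, viewed with the storage item size
    quant_type: str       # e.g. "nf4"; carried as metadata only in this sketch
    storage_format: str   # struct-style code: "B" = 8-bit, "H" = 16-bit items
    numel: int            # number of 4-bit codes in the original weight


def pack_4bit(codes, storage_format="B", quant_type="nf4"):
    """Pack 4-bit codes (ints in 0..15) two per byte, then view as storage."""
    if len(codes) % 2:
        raise ValueError("need an even number of 4-bit codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    packed = bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))
    # Reinterpret the same bytes with a wider item size; no data is changed.
    view = memoryview(packed).cast(storage_format)
    return PackedWeight(view, quant_type, storage_format, len(codes))


def unpack_4bit(weight):
    """Recover the 4-bit codes using the metadata kept alongside the buffer."""
    raw = weight.storage.cast("B")  # back to a plain byte view
    codes = []
    for byte in raw:
        codes.append(byte >> 4)
        codes.append(byte & 0x0F)
    return codes[: weight.numel]


codes = [1, 15, 0, 7, 3, 3, 12, 9]
as_16bit = pack_4bit(codes, storage_format="H")  # stand-in for 16-bit storage
assert unpack_4bit(as_16bit) == codes            # round trip is lossless
```

Here the 16-bit integer view is only a stand-in for a storage dtype such as BF16; the real library works on tensors and tracks more state than this sketch does.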
@Titus-von-Koeller @winglian