https://github.com/huggingface/transformers-bloom-inference/blob/7bea3526d8270b4aeeefecc57d7d7d638e2bbe0e/bloom-inference-scripts/bloom-ds-inference.py#L121-L137
In this code, the first argument to

```python
with deepspeed.OnDevice(dtype=dtype, device="meta"):
```

uses `float16`, while in

```python
model = AutoModelForCausalLM.from_config(config, torch_dtype=torch.bfloat16)
```

we use `bfloat16`. I wonder why we have this inconsistency?
@allanj it doesn't really matter. When specifying `meta`, it only allocates empty placeholder tensors, and both dtypes are 16-bit.
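For context, here is a minimal sketch of that meta-device initialization in isolation (the small model name and the printed check are illustrative, not from the linked script):

```python
# Minimal sketch, assuming transformers and deepspeed are installed;
# "bigscience/bloom-560m" is just a small config used for illustration.
import torch
import deepspeed
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("bigscience/bloom-560m")

# On the "meta" device, parameters are shape/dtype placeholders with no storage,
# so whether float16 or bfloat16 is passed here has no effect on memory use.
with deepspeed.OnDevice(dtype=torch.float16, device="meta"):
    model = AutoModelForCausalLM.from_config(config, torch_dtype=torch.bfloat16)

print(next(model.parameters()).device)  # meta -- no real weights loaded yet
# The actual 16-bit weights are materialized later, e.g. by deepspeed.init_inference
# loading the checkpoint, as in the linked script.
```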