loubnabnl / santacoder-finetuning

Fine-tune SantaCoder for Code/Text Generation.
Apache License 2.0
184 stars · 23 forks

Why is the inference speed using fp16 or bf16 similar to fp32? #19

Open lionday opened 1 year ago

lionday commented 1 year ago

Is there a specific configuration needed? I load the model with:

```python
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, trust_remote_code=True, torch_dtype=torch.float16
)
```

loubnabnl commented 1 year ago

Can you give more details on how you're doing generation? fp16 and bf16 are faster in my experiments.