huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
132.11k stars 26.32k forks

Whisper Inference in BF16 precision. #29475

Open Aditya-Scalers opened 6 months ago

Aditya-Scalers commented 6 months ago

System Info

A google colab instance.

Who can help?

@sanchit-gandhi

Reproduction

My sample code:

import torch
from transformers import pipeline

device = "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base",
    device=device,
    torch_dtype=torch.bfloat16,
)

sample = "/content/sample.mp3"

prediction = pipe(sample)["text"]
print(prediction)

Expected behavior

I used the sample pipeline code from the Hugging Face Whisper examples and passed torch.bfloat16 as the torch_dtype argument so that inference would run in BF16 precision. Below is the error I encountered.

RuntimeError: Input type (torch.FloatTensor) and weight type (CPUBFloat16Type) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
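For context, this error typically means the mel input features are still float32 while the model weights were cast to bfloat16. The mismatch can be reproduced with plain PyTorch; this is a minimal sketch, and the layer shapes below are illustrative, not Whisper's actual configuration:

```python
import torch

# A conv layer whose weights are cast to bfloat16, like the pipeline's model
conv = torch.nn.Conv1d(80, 384, kernel_size=3).to(torch.bfloat16)

# Mel-style features left in float32, like the processor's output
x = torch.randn(1, 80, 3000)

try:
    conv(x)  # raises the same input/weight dtype mismatch RuntimeError
except RuntimeError as e:
    print(e)

# Casting the input to match the weights resolves it
out = conv(x.to(torch.bfloat16))
print(out.dtype)  # torch.bfloat16
```

The fix on the pipeline side is to cast the extracted features to the model's dtype before the forward pass, which is what the code snippet above emulates with the explicit `.to(torch.bfloat16)` call.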

amyeroberts commented 3 months ago

Gentle ping @sanchit-gandhi

amyeroberts commented 2 months ago

another ping @sanchit-gandhi

amyeroberts commented 2 months ago

cc @kamilakesbi

amyeroberts commented 1 month ago

cc @ylacombe

ylacombe commented 2 weeks ago

I can't find the right PR, but this seems to have been solved when I look at the pipeline code. There was a tentative fix in #29486, but again, it appears to have been fixed judging from the code itself!

Besides, when running your code snippet, everything seems to work fine! Can you try again on the latest transformers version and confirm it works as expected?
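When retesting, it can help to first confirm which transformers version is actually installed in the environment; a small stdlib-only check:

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # Report the installed transformers version before rerunning the snippet
    print("transformers", version("transformers"))
except PackageNotFoundError:
    print("transformers is not installed")
```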

Also cc @eustlb for visibility