huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
132.11k stars 26.32k forks

Whisper Inference in BF16 precision. #29475

Open Aditya-Scalers opened 6 months ago

Aditya-Scalers commented 6 months ago

System Info

A google colab instance.

Who can help?

@sanchit-gandhi

Reproduction

My sample code:

import torch
from transformers import pipeline

device = "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-base",
    device=device,
    torch_dtype=torch.bfloat16,
)

sample = "/content/sample.mp3"

prediction = pipe(sample)["text"]
print(prediction)

Expected behavior

I used the sample pipeline code from the Hugging Face Whisper examples and passed torch.bfloat16 as the torch_dtype argument so that inference would run in BF16 precision. Below is the error I encountered.

RuntimeError: Input type (torch.FloatTensor) and weight type (CPUBFloat16Type) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
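For context, this error typically means the mel input features are still float32 while the model weights were cast to bfloat16. The mismatch can be reproduced with plain PyTorch; this is a minimal sketch, and the layer shapes below are illustrative, not Whisper's actual configuration:

```python
import torch

# A conv layer whose weights are cast to bfloat16, like the pipeline's model
conv = torch.nn.Conv1d(80, 384, kernel_size=3).to(torch.bfloat16)

# Mel-style features left in float32, like the processor's output
x = torch.randn(1, 80, 3000)

try:
    conv(x)  # raises the same input/weight dtype mismatch RuntimeError
except RuntimeError as e:
    print(e)

# Casting the input to match the weights resolves it
out = conv(x.to(torch.bfloat16))
print(out.dtype)  # torch.bfloat16
```

The fix on the pipeline side is to cast the extracted features to the model's dtype before the forward pass, which is what the code snippet above emulates with the explicit `.to(torch.bfloat16)` call.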

amyeroberts commented 3 months ago

Gentle ping @sanchit-gandhi

amyeroberts commented 2 months ago

another ping @sanchit-gandhi

amyeroberts commented 2 months ago

cc @kamilakesbi

amyeroberts commented 1 month ago

cc @ylacombe

ylacombe commented 2 weeks ago

I can't find the right PR, but this seems to have been solved when I look at the pipeline code. There was a tentative fix in #29486, but again, it appears to have been fixed judging from the code itself!

Besides, when running your code snippet, everything seems to work fine! Can you try again on the latest transformers version and confirm it works as expected?
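When retesting, it can help to first confirm which transformers version is actually installed in the environment; a small stdlib-only check:

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # Report the installed transformers version before rerunning the snippet
    print("transformers", version("transformers"))
except PackageNotFoundError:
    print("transformers is not installed")
```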

Also cc @eustlb for visibility