[Whisper] convert with FP32 running failed

Jay19751103 commented 2 weeks ago

Describe the bug: Converting an mp3 or wav file to text failed.

To Reproduce:

python test_transcription.py --audio_path d:\disc\converted.mp3 --predict_timestamps --language Japanese --config whisper_cpu_fp32.json

Expected behavior: The audio file is converted to output text.


Olive logs:

2024-06-21 15:37:47.9853878 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running ConstantOfShape node. Name:'/ConstantOfShape' Status Message: C:\a_work\1\s\onnxruntime\core\framework\op_kernel.cc:83 onnxruntime::OpKernelContext::OutputMLValue status.IsOK() was false. Tensor shape cannot contain any negative value


trajepl commented 2 weeks ago

Have you tried upgrading Olive and onnxruntime?

Jay19751103 commented 1 week ago

> Have you tried upgrading Olive and onnxruntime?

I used a tool to trim the audio file from 48 s to under 30 s. It runs now.
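Since trimming the clip made the error go away, it may help to check a clip's duration before running the transcription script. The following is a minimal sketch using only Python's standard `wave` module. It works on WAV files only (an mp3 would have to be converted first), and the helper names and the 30-second default are illustrative assumptions, not part of Olive.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as f:
        return f.getnframes() / f.getframerate()


def is_short_enough(path, max_seconds=30.0):
    """True if the clip is no longer than max_seconds (hypothetical helper)."""
    return wav_duration_seconds(path) <= max_seconds
```

For the 48 s clip described above, `is_short_enough` would return False, signalling that the file should be trimmed or split before it is passed to the script.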

jambayk commented 1 week ago

Yes, 30 seconds is the maximum length of audio the Whisper model can handle. Please see my comment on a related issue: https://github.com/microsoft/Olive/issues/1108#issuecomment-2075824364
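Given the 30-second limit, one possible workaround for longer recordings is to split them into pieces of at most 30 seconds and transcribe each piece separately. The sketch below does this for WAV input with Python's standard `wave` module. The function name and output naming scheme are hypothetical, and whether the resulting pieces transcribe cleanly with `test_transcription.py` is an assumption, since a hard cut can land in the middle of a word.

```python
import wave

MAX_SECONDS = 30  # limit stated in this thread


def split_wav(path, out_prefix, max_seconds=MAX_SECONDS):
    """Split a WAV file into consecutive pieces of at most max_seconds each.

    Returns the list of written file paths, in order.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * max_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of input reached
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width and rate from the source;
                # the frame count in the header is corrected on write.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

A 48 s input would yield two files, one of 30 s and one of 18 s, each of which can then be passed to the script through `--audio_path`.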

microsoft / Olive

[Whisper] convert with FP32 running failed #1207