Closed by tqtifnypmb 7 months ago
This doesn't sound like a precision issue to me. Have you tried the same prompt / audio in the Python MLX Whisper example? If it works there then either:
If it doesn't work in the Python example either, then we'll need to investigate further (presumably you've already tested this with the original Whisper code). In that case, could you provide the input, the expected output, and steps to reproduce?
After comparing the output of MLX Whisper and my implementation part by part, I found that this issue was caused by precision losses outside MLX.
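For anyone debugging a similar port, the part-by-part comparison can be sketched roughly like this. This is a minimal pure-Python illustration with synthetic values; in practice you would dump the corresponding intermediate tensors (encoder output, each decoder block, etc.) from both the reference Python MLX Whisper and the Swift port, and the variable names below are hypothetical:

```python
# Sketch of a part-by-part numerical comparison between two implementations.
# The values here are synthetic stand-ins; the real check would load the
# dumped intermediates from the reference and the ported implementation.

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat tensors."""
    assert len(a) == len(b), "tensors must have the same shape"
    return max(abs(x - y) for x, y in zip(a, b))

# Pretend these came from the reference and the port at the same layer.
reference = [0.1234, -0.5678, 0.9012]
ported    = [0.1233, -0.5679, 0.9013]

diff = max_abs_diff(reference, ported)
print(f"max abs diff: {diff:.6f}")

# Walk the layers in order and stop at the first one that diverges
# beyond a tolerance -- that layer is where the bug (or precision loss) lives.
TOLERANCE = 1e-3
if diff > TOLERANCE:
    print("outputs diverge here -- inspect this layer")
else:
    print("outputs match within tolerance")
```

Checking layer by layer in execution order is what localizes the problem: the first layer whose outputs diverge is the one to inspect, and everything downstream will differ as a consequence.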
Hi,

I'm using swift-mlx to implement the Whisper model. The encoder runs on CoreML, and the decoder runs on MLX. Everything works fine for the `tiny`, `tiny.en`, `base`, `base.en`, and `small.en` models. However, I encountered some strange issues with the other models:

1) For the `small` model, whenever I include a prompt, the decoder's output becomes abnormal. If I remove the prompt, the output becomes normal again. This issue only occurs with the `small` model.
2) Both the `small` and `medium` models have problems transcribing languages other than English.

Since all other variables are the same, I wanted to ask whether the size of the model could be causing precision issues in MLX calculations?
Thanks