The implementation is based on the AudioEncoder from the mlx-examples repository.
To verify that the audio encoder works as expected, I have added weight loading. The weights come from the https://huggingface.co/jkrukowski/whisper-tiny-mlx-safetensors repository, which contains the whisper-tiny-mlx weights converted to the safetensors format (for now, MLX Swift cannot load .npz files). I have also added a step to the Makefile that downloads the MLX weights so that the tests run correctly. This could be removed once the MLX branch is fully integrated into the main repository and we decide on the best way to handle the weights.
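For reviewers unfamiliar with MLX Swift, loading a safetensors file into a module can look roughly like the sketch below. It is not the code from this PR: `loadWeights(into:from:)` is a hypothetical helper name, and it assumes the keys in the file match the module's parameter key paths.

```swift
import Foundation
import MLX
import MLXNN

/// Loads a flat safetensors file into an MLXNN module.
/// Hypothetical helper for illustration; the name is not taken from the PR.
func loadWeights(into module: Module, from url: URL) throws {
    // Read every tensor in the file, keyed by its flat, dot-separated name.
    let flatWeights: [String: MLXArray] = try loadArrays(url: url)

    // Rebuild the nested parameter structure from the dot-separated keys.
    let parameters = ModuleParameters.unflattened(flatWeights)

    // Ask MLXNN to verify the update rather than accept a partial one.
    try module.update(parameters: parameters, verify: [.all])
}
```

A call would then be something like `try loadWeights(into: encoder, from: weightsURL)`, where `encoder` and `weightsURL` stand in for the audio encoder instance and the location of the downloaded file.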
I have changed the project structure slightly. The common test utilities now live in the WhisperKitTestsUtils target, and the resources used for testing (audio files and models) have moved there as well. This way we can reuse the resources in both MLX and non-MLX tests. It also simplifies the project structure: the WhisperKitTests target no longer needs a custom path and a list of excluded files.
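As a rough illustration of that layout, a manifest along these lines would express it. This is a sketch, not the manifest from this PR: the paths, the `Resources` folder, the package name, and the `WhisperKitMLXTests` target name are assumptions, and the real MLX test target would also depend on the MLX code.

```swift
// swift-tools-version:5.9
import PackageDescription

// Hypothetical manifest fragment: paths, resource folders, and the MLX test
// target name are illustrative assumptions, not copied from the repository.
let package = Package(
    name: "whisperkit",
    products: [
        .library(name: "WhisperKit", targets: ["WhisperKit"]),
    ],
    targets: [
        .target(name: "WhisperKit"),
        // Shared helpers plus the audio files and models used by the tests.
        .target(
            name: "WhisperKitTestsUtils",
            dependencies: ["WhisperKit"],
            path: "Tests/WhisperKitTestsUtils",
            resources: [.process("Resources")]
        ),
        // With the resources moved out, this target can use the default layout.
        .testTarget(
            name: "WhisperKitTests",
            dependencies: ["WhisperKit", "WhisperKitTestsUtils"]
        ),
        // Placeholder name for the MLX test target.
        .testTarget(
            name: "WhisperKitMLXTests",
            dependencies: ["WhisperKit", "WhisperKitTestsUtils"]
        ),
    ]
)
```

Because the resources belong to a regular target, both test targets reach them through the same dependency instead of each declaring its own copy.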
This PR adds the MLX Audio Encoder.