Closed: Abhinay1997 closed this 5 months ago.
We also need a scheme to detect regressions for these tests on a chip-by-chip basis. We can store the data inside the test resources directly, or host it in a Hugging Face repo.
Files should be named something like this:
Resources/Fixtures/RegressionTests/\(Process.processor)/\(modelName)/\(date).json
All of these should be at or below 1.05x of the baseline data, i.e. a given regression test can be up to 5% slower than the baseline, to account for hardware fluctuations between runs.
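As a rough sketch of that tolerance check (in Python for illustration only; the field names and values below are hypothetical, and the real check would live in the Swift tests):

```python
# Hypothetical baseline and current measurements, shaped like entries in a
# Resources/Fixtures/RegressionTests/<processor>/<model>/<date>.json file.
baseline = {"timeElapsed": 10.0}
current = {"timeElapsed": 10.3}

TOLERANCE = 1.05  # a run may be up to 5% slower than the baseline


def within_tolerance(baseline_time: float, current_time: float) -> bool:
    """A regression test passes if its elapsed time is at or below
    1.05x the baseline, allowing for hardware fluctuations."""
    return current_time <= baseline_time * TOLERANCE


print(within_tolerance(baseline["timeElapsed"], current["timeElapsed"]))  # True
```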
We don't yet have code to calculate WER from swift, but from python we use this metric: https://huggingface.co/spaces/evaluate-metric/wer
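For reference, WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal from-scratch Python sketch of what the linked metric computes (the Hugging Face `evaluate` metric handles this in Python today; a Swift port would follow the same logic):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance between the
    hypothesis and the reference, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)


# One dropped word out of a six-word reference -> WER of 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```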
Should also have a test that runs all available models from the whisper-coreml repo, similar to testOutputAll. However, I wouldn't expect this to run on the GitHub runners due to their limited resources; it will just be run manually before releases.
Should resolve #61
You'll need to install xcparse via `brew install chargepoint/xcparse/xcparse`, then:

```shell
xcodebuild clean build-for-testing -scheme whisperkit-Package -destination generic/platform=macOS | xcpretty
xcodebuild test -only-testing WhisperKitTests/RegressionTests -scheme whisperkit-Package -destination "platform=macOS,arch=arm64" -resultBundlePath ~/Downloads
xcparse attachments ~/Downloads/<latest_xc_result_file>.xcresult
```

Note: the `xcparse` command above will output the attachments as files in the current directory.
Do let me know how I can improve on the linting!
As for running it, the commands above should work; if not, you can run it manually from Xcode and see the test attachments in the Xcode test result.
Will have linting rules set up soon; until then this is good to merge 👍
Checklist:
- [ ] Add WER calculations in Swift (`Utils.swift`). Moved out of this PR; plan to include it as part of the EN Normalization PR.
`testOutputAll`
See the Colab notebook for plots.