Closed: Abhinay1997 closed this 5 months ago.
We also need a scheme to detect regressions for these tests on a chip-by-chip basis. We can store the data inside the test resources directly, or host it in a Hugging Face repo.
Files should be named something like this:
Resources/Fixtures/RegressionTests/\(Process.processor)/\(modelName)/\(date).json
All of these should be at or below 1.05x of the baseline data, i.e. a given regression test can be up to 5% slower than the baseline, to account for hardware fluctuations between runs.
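As a rough sketch of that tolerance check (in Python for illustration only; the field names and values below are hypothetical, and the real check would live in the Swift tests):

```python
# Hypothetical baseline and current measurements, shaped like entries in a
# Resources/Fixtures/RegressionTests/<processor>/<model>/<date>.json file.
baseline = {"timeElapsed": 10.0}
current = {"timeElapsed": 10.3}

TOLERANCE = 1.05  # a run may be up to 5% slower than the baseline


def within_tolerance(baseline_time: float, current_time: float) -> bool:
    """A regression test passes if its elapsed time is at or below
    1.05x the baseline, allowing for hardware fluctuations."""
    return current_time <= baseline_time * TOLERANCE


print(within_tolerance(baseline["timeElapsed"], current["timeElapsed"]))  # True
```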
We don't yet have code to calculate WER from swift, but from python we use this metric: https://huggingface.co/spaces/evaluate-metric/wer
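For reference, WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal from-scratch Python sketch of what the linked metric computes (the Hugging Face `evaluate` metric handles this in Python today; a Swift port would follow the same logic):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance between the
    hypothesis and the reference, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)


# One dropped word out of a six-word reference -> WER of 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```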
Should also have a test that runs all available models from the whisper-coreml repo, similar to testOutputAll. However, I wouldn't expect this to run on the GitHub runners due to their limited resources; it will just be run manually before releases.
Should resolve #61
You'll need to install xcparse via `brew install chargepoint/xcparse/xcparse`, then:

```shell
xcodebuild clean build-for-testing -scheme whisperkit-Package -destination generic/platform=macOS | xcpretty
xcodebuild test -only-testing WhisperKitTests/RegressionTests -scheme whisperkit-Package -destination "platform=macOS,arch=arm64" -resultBundlePath ~/Downloads
xcparse attachments ~/Downloads/<latest_xc_result_file>.xcresult
```

Note: the `xcparse` command above will output the attachments as files in the current directory.
Do let me know how I can improve on the linting!
As for running it, the commands above should work; if not, you can run it manually from Xcode and see the test attachments in the Xcode test result.
Will have linting rules set up soon; until then this is good to merge 👍
Checklist:
- [ ] Add WER calculations in Swift (`Utils.swift`). Moved out of this PR; plan to include it as part of the EN Normalization PR.
`testOutputAll`
See the Colab notebook for plots.