Open BrandonWeng opened 1 week ago
@BrandonWeng Is there a specific feature you're looking for in the latest version? We upgraded to the point right before jinja was added, to avoid another dependency that we don't have much use for, but would consider upgrading if there's a need.
I think we found a way around this.
We were trying to get MLX running, but the examples required a version > 0.1.12.
I'll just leave this here: https://github.com/argmaxinc/WhisperKit/pull/249
Happy to close the issue + PR if you don't think it's necessary. Just wanted to leave it here in case other folks run into the same issue. Spent several hours trying to work around it, but this turned out to be the simplest solution for us.
Thanks! Curious to hear more about the approach you're taking with MLX; we have a PR in progress that still needs a couple of perf improvements: #200
Unfortunately, I'm pretty new to Swift and its ecosystem as a whole. I'm just trying out a bunch of different things right now. Will report back once I have a better understanding!
For now, I've only been comparing the MLX models; the quantized models use significantly less memory: mlx-community/Llama-3.2-1B-Instruct-bf16 uses around 2.5GB of memory, while mlx-community/Llama-3.2-1B-Instruct-8bit is around 1.5GB. Performance-wise, bf16 isn't too far off from 8bit.
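Those numbers line up with a rough back-of-envelope estimate of weight memory (the ~1.24B parameter count for Llama-3.2-1B is an assumption here; runtime overhead like the KV cache accounts for the rest):

```python
# Rough weight-memory estimate. The 1.24B parameter count is an assumption;
# actual process memory also includes KV cache, buffers, and quantization scales.
params = 1.24e9

bf16_gb = params * 2 / 1e9  # bf16: 2 bytes per weight
int8_gb = params * 1 / 1e9  # 8-bit: 1 byte per weight, plus small scale overhead

print(f"bf16: ~{bf16_gb:.2f} GB, 8-bit: ~{int8_gb:.2f} GB")
# bf16: ~2.48 GB, 8-bit: ~1.24 GB
```

So the ~2.5GB vs ~1.5GB figures are dominated by weight storage, with overhead making up the difference on the 8-bit side.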
Hey folks, was wondering what it would take to upgrade `swift-transformer` to the latest version? Apologies, totally new to Swift. Happy to make the PR if there are no known blockers.
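For anyone landing here, the bump itself would be a one-line change to the dependency declaration in Package.swift. This is a hypothetical sketch: the URL assumes the Hugging Face swift-transformers repo, and the `from:` version is an assumption based on the "examples required > 0.1.12" note, not a verified minimum:

```swift
// Hypothetical Package.swift excerpt; pin whichever version the MLX
// examples actually require (anything above 0.1.12 per the thread).
dependencies: [
    .package(url: "https://github.com/huggingface/swift-transformers", from: "0.1.13")
]
```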