octoml / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
https://mlc.ai/mlc-llm
Apache License 2.0

[Tracking] Sampler optimization #199

Open · masahi opened 9 months ago

masahi commented 9 months ago

Let's collect the remaining issues we are aware of related to sampler performance.

masahi commented 9 months ago

The first issue seems to have been fixed by @vvchernov in https://github.com/octoml/mlc-llm/pull/215.

vvchernov commented 9 months ago

Hello @masahi! No, my fix in #215 resolved the very large (more than an order of magnitude) performance regression that appeared after #214.

About task 1:
1. We observed a ~25-30% slowdown after #192.
2. It has not been resolved yet; I'm investigating the issue.

About task 2: I remember about logprobs, but it looks like resolving task 1 requires a sampler refactor, and I want to do that first (or somebody else will do it).
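For context on the logprobs task: the idea is to return per-token log-probabilities (and the top alternatives) alongside the sampled token. Below is a minimal, illustrative sketch of how that can be computed from a logits vector; the function name, the `top_logprobs` parameter, and the use of NumPy are assumptions for illustration only, not the actual mlc-llm sampler API.

```python
import numpy as np

def sample_with_logprobs(logits: np.ndarray, top_logprobs: int = 5, rng=np.random):
    """Sample a token from a 1D logits vector and return its log-probability
    plus the top-k alternative log-probabilities (illustrative sketch only)."""
    # Numerically stable log-softmax over the vocabulary dimension.
    shifted = logits - logits.max()
    logprobs = shifted - np.log(np.exp(shifted).sum())
    # Sample a token id from the corresponding probability distribution.
    probs = np.exp(logprobs)
    token = int(rng.choice(len(probs), p=probs))
    # Collect the k most likely token ids and their log-probabilities.
    top_ids = np.argsort(logprobs)[::-1][:top_logprobs]
    top = {int(i): float(logprobs[i]) for i in top_ids}
    return token, float(logprobs[token]), top
```

The log-softmax and top-k bookkeeping add extra work per sampled token, which is why it interacts with the sampler-refactor question above.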