Open photomz opened 7 months ago
Hello!
This is not the official repo for the paper, but rather work by an inspired fan :) If you would like the paper's authors to see this, consider opening an issue at https://github.com/mit-han-lab/streaming-llm instead.
I'm personally not sure which tokens in particular they used as placeholders, but it's quite possible they are just regular tokens that get "considered" sink tokens simply by never letting the sliding-window cache evict them.
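As a rough illustration of that "never evict" idea (the function name and default sizes here are my own, not from the paper's code), the cache policy might look like:

```python
def evict_kv(cache, num_sinks=4, window=8):
    """Sliding-window eviction that pins the first `num_sinks` entries
    (the attention sinks) and keeps only the most recent `window`
    entries after them; everything in between is dropped."""
    if len(cache) <= num_sinks + window:
        return list(cache)
    return list(cache[:num_sinks]) + list(cache[-window:])

# e.g. KV entries for token positions 0..19, with 4 sinks and a window of 8
kept = evict_kv(list(range(20)))
# keeps positions 0-3 (sinks) and 12-19 (recent window)
```

So nothing special is stored in those positions; they are ordinary cached keys/values that are simply exempt from eviction.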
Kudos to the authors for open-sourcing a practical LLM chat improvement so quickly. In the preprint's Section 3.3, you experiment with pre-training models with a dedicated sink token.
Specifically, how do you add a "sink token"? Is it functionally different from GPT's \<startoftext> or Llama 2's [BOS] token? Does it need any special training logic? Releasing a code snippet of the sink-token training would be great, thanks.
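For what it's worth, my guess at what sink-token pre-training could look like, assuming it is just a dedicated learnable token prepended to every training sequence (the `<sink>` string, id assignment, and helper names below are hypothetical, not from the paper):

```python
SINK_TOKEN = "<sink>"  # hypothetical dedicated token, analogous to BOS

def build_vocab(base_tokens):
    """Reserve id 0 for the sink token, then number the remaining tokens."""
    vocab = {SINK_TOKEN: 0}
    for tok in base_tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def prepend_sink(token_ids, sink_id=0):
    """Start every pre-training sequence with the sink token, so the model
    learns to dump excess attention there instead of on the first content
    token; at inference the same id is pinned in the cache."""
    return [sink_id] + token_ids

vocab = build_vocab(["the", "cat", "sat"])
seq = prepend_sink([vocab["the"], vocab["cat"], vocab["sat"]])
# seq begins with the reserved sink id 0
```

If that guess is right, the only difference from BOS would be the training-time guarantee that the token appears at position 0 of every sequence, plus the inference-time guarantee that it is never evicted.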