mit-han-lab / streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License

Enable explicitly setting transformer model cache #56

Open · JiaxuanYou opened 9 months ago

JiaxuanYou commented 9 months ago

Allow users to explicitly set where the downloaded transformer model (from Hugging Face) is saved. This is useful when the default ~/.cache folder is linked to an unexpected path, e.g., when running inside a Docker container.
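
One way this could look (a minimal sketch, not the repo's current API): thread an optional `cache_dir` argument through to `from_pretrained`, which `transformers` already accepts. The `load` wrapper, the model name, and the paths below are illustrative.

```python
from typing import Optional

from transformers import AutoModelForCausalLM, AutoTokenizer


def load(model_name_or_path: str, cache_dir: Optional[str] = None):
    """Load a model/tokenizer, optionally into an explicit cache directory.

    `cache_dir` is the hypothetical option this issue asks for; `transformers`
    itself already accepts `cache_dir=` in `from_pretrained`, so the wrapper
    only needs to pass it through. When `cache_dir` is None, the library
    default (under ~/.cache/huggingface) is used.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, cache_dir=cache_dir)
    model = AutoModelForCausalLM.from_pretrained(model_name_or_path, cache_dir=cache_dir)
    return model, tokenizer


# Example: redirect downloads to a bind-mounted volume inside a container.
# model, tokenizer = load("lmsys/vicuna-13b-v1.3", cache_dir="/data/hf_cache")
```

As a workaround that needs no code change, the Hugging Face cache location can also be relocated via the `HF_HOME` environment variable (or the legacy `TRANSFORMERS_CACHE`) before launching the process.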