huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

Falcon implementation issues #2065

Open jorgeantonio21 opened 5 months ago

jorgeantonio21 commented 5 months ago

It seems that clearing the cache in the current Falcon model implementation is not working properly. Whenever a second query is run, the cache from the previous query is still there.

LaurentMazare commented 5 months ago

Actually, there was no way to flush the KV cache in Falcon. I've added a function for this in #2066; you should call it on further queries.
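To illustrate the pattern, here is a minimal, self-contained sketch of why the cache has to be flushed between independent queries. It does not use candle itself: `ToyModel`, `forward`, `clear_kv_cache`, and `run_queries` are hypothetical names made up for this example, and the actual function added in #2066 may be named or shaped differently. The "cache" here is just a list of token ids standing in for the per-layer key/value tensors.

```rust
/// Toy stand-in for a decoder with a key/value cache.
/// This is NOT candle's Falcon type; all names here are hypothetical.
struct ToyModel {
    // Tokens remembered from earlier forward passes. A real model would
    // keep key/value tensors per attention layer instead.
    kv_cache: Vec<u32>,
}

impl ToyModel {
    fn new() -> Self {
        Self { kv_cache: Vec::new() }
    }

    /// Appends the new tokens to the cache and returns the total context
    /// length the model now attends over.
    fn forward(&mut self, tokens: &[u32]) -> usize {
        self.kv_cache.extend_from_slice(tokens);
        self.kv_cache.len()
    }

    /// Hypothetical flush method: drops everything cached so far.
    fn clear_kv_cache(&mut self) {
        self.kv_cache.clear();
    }
}

/// Runs each prompt as an independent query, flushing the cache first so
/// that no context leaks over from the previous prompt.
fn run_queries(model: &mut ToyModel, prompts: &[&[u32]]) -> Vec<usize> {
    prompts
        .iter()
        .map(|prompt| {
            model.clear_kv_cache();
            model.forward(prompt)
        })
        .collect()
}
```

Without the flush, a second call to `forward` would see the first query's tokens as part of its context, which matches the behaviour reported above.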

jorgeantonio21 commented 5 months ago

Thanks @LaurentMazare !