FasterDecoding / SnapKV


observation window size and consistency between layers #17

Open Cooperx521 opened 2 weeks ago

Cooperx521 commented 2 weeks ago

Hello :)

Thank you for the brilliant work and for sharing your code. After reading the paper and reviewing the related code, I have the following questions:

  1. Have you run experiments that vary the observation window size (e.g., sizes from 1 to 64)? How does the window size affect the hit rate and overall model performance?
  2. In the "layer-wise average hit rate" experiment, the hit rate of the middle layers is significantly lower than that of the shallow and deep layers. Do you know why?
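
To make the first question concrete, here is a minimal sketch of the kind of window-size sweep I have in mind. It is not taken from the SnapKV code. It assumes that "hit rate" means the overlap between the top-k prefix positions chosen from the last `window` prompt queries and the top-k prefix positions that the generated tokens actually attend to. The names `hit_rate`, `topk_positions`, `prefix_len`, and `k` are hypothetical, the attention maps are random placeholders rather than real model outputs, and any pooling over the votes is left out.

```python
import numpy as np


def topk_positions(scores, k):
    """Return the set of indices of the k largest scores."""
    return set(np.argpartition(scores, -k)[-k:].tolist())


def hit_rate(prompt_attn, gen_attn, window, prefix_len, k):
    """Assumed definition of hit rate for one attention head.

    prompt_attn: (L_prompt, L_prompt) attention of prompt queries over prompt keys.
    gen_attn:    (L_gen, L_prompt) attention of generated-token queries over prompt keys.
    window:      number of trailing prompt queries used as the observation window.
    prefix_len:  number of leading prompt positions that are candidates for selection.
    k:           number of prefix positions kept.
    """
    # Votes cast by the observation window over the prefix positions.
    votes = prompt_attn[-window:, :prefix_len].sum(axis=0)
    # Attention mass the generated tokens place on the same prefix positions.
    used = gen_attn[:, :prefix_len].sum(axis=0)
    selected = topk_positions(votes, k)
    needed = topk_positions(used, k)
    return len(selected & needed) / k


if __name__ == "__main__":
    # Placeholder data only: random maps stand in for real attention weights.
    rng = np.random.default_rng(0)
    prompt_len, gen_len, max_window = 256, 32, 64
    prompt_attn = rng.random((prompt_len, prompt_len))
    gen_attn = rng.random((gen_len, prompt_len))
    # Keep the candidate prefix fixed so that different windows are comparable.
    prefix_len = prompt_len - max_window
    for window in (1, 2, 4, 8, 16, 32, 64):
        rate = hit_rate(prompt_attn, gen_attn, window, prefix_len, k=32)
        print(f"window={window:2d}  hit_rate={rate:.3f}")
```

With real per-layer attention maps in place of the random arrays, the same loop could be run layer by layer, which would also give the layer-wise view that the second question is about.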

Thank you for your excellent paper!