opengear-project / GEAR

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
MIT License
128 stars 10 forks

Questions about the code structure #10

Closed CUHKSZzxy closed 1 month ago

CUHKSZzxy commented 3 months ago

Thanks for your excellent work!

I am a little confused about the code structure and how each part is meant to be used:

GEARLM/GEARLM
    -- Simulated
    -- TrueCompression
        -- models
        -- old_models

What is the difference between Simulated and TrueCompression, and how is each one meant to be used? And what is the difference between models and old_models?

Any help would be appreciated!

HaoKang-Timmy commented 2 months ago

We are now refining the algorithm, and new results will probably come out next week with reproducible code.

shhn1 commented 2 months ago

@HaoKang-Timmy Hello, are there any updates? : )

HaoKang-Timmy commented 1 month ago

> @HaoKang-Timmy Hello, are there any updates? : )

Yes, we have updated the code, and the baseline on arXiv will be changed accordingly.