ulab-uiuc / AGI-survey


Classification of LLM training and inference efficiency #2

Closed Monstertail closed 5 months ago

Monstertail commented 5 months ago

[figure: classification]

Hi Jingyu, I wonder whether something in this classification is not quite accurate. There are many papers about KV cache optimization/compression on the inference side (at least five new papers each month) rather than the training side; there are also many papers about memory management in LLM serving systems (e.g., vLLM, SGLang).

I feel like training is compute-bound while inference is memory-bound. Although there are papers such as GaLore that make pretraining/fine-tuning more memory-efficient, I still think memory management is more important (or at least equally important) on the inference side.
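As a rough illustration of why inference tends to be memory-bound, here is a minimal back-of-the-envelope sketch of per-sequence KV cache growth. The model dimensions are assumptions chosen for illustration (a Llama-2-7B-like configuration with fp16 caches), not figures from the survey:

```python
# Back-of-the-envelope KV cache size per sequence during decoding.
# Assumed (illustrative) configuration: 32 layers, 32 KV heads,
# head_dim 128, fp16 (2 bytes per element) -- roughly Llama-2-7B-like.

def kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                   seq_len=4096, bytes_per_elem=2):
    """Total bytes for keys + values across all layers at a given length."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len

if __name__ == "__main__":
    per_token_kib = kv_cache_bytes(seq_len=1) / 1024
    per_seq_gib = kv_cache_bytes(seq_len=4096) / 1024 ** 3
    print(f"KV cache per token: {per_token_kib:.0f} KiB")          # ~512 KiB
    print(f"KV cache at 4096 tokens: {per_seq_gib:.2f} GiB/seq")   # ~2 GiB
```

Under these assumptions, a single 4k-token sequence already holds about 2 GiB of cache, and serving tens of concurrent requests multiplies that, which is exactly the pressure that KV cache compression and paged memory management (as in vLLM) aim to relieve.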

What do you think?

@Jingyu6

Jingyu6 commented 5 months ago

Hi @Monstertail,

Thanks for pointing this out! I think what you said about the training/inference classification makes sense, and I will adjust it either in this version or the next. However, the list here in the repo is not exhaustive (it is only a subset of what is included in the paper), just to keep it concise. I think it is more important to keep the most essential works in the repo here, though we will update the paper with more recent works later. Thanks for your suggestions.

Best, Jingyu