WukLab / preble

Stateful LLM Serving
Apache License 2.0

Add Load based Memory Eviction #44

Closed vikranth22446 closed 6 months ago

vikranth22446 commented 6 months ago

Consider using a priority queue for each node, then evict based on that priority queue. This should handle memory much better in terms of sharing.
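A rough sketch of what this could look like, not taken from the preble codebase. It assumes each node keeps a min-heap of its cached entries ordered by a load score, and evicts the lowest-load entries first until enough memory is freed. The names `NodeEvictionQueue`, `touch`, and `evict`, the meaning of `load`, and the byte-based `size` are all hypothetical.

```python
import heapq
import itertools
from collections import defaultdict


class NodeEvictionQueue:
    """Hypothetical per-node priority queue: lowest load is evicted first."""

    def __init__(self):
        self._heap = []                    # (load, insertion order, key)
        self._live = {}                    # key -> (current load, size in bytes)
        self._counter = itertools.count()  # tie-breaker for equal loads

    def touch(self, key, load, size):
        """Insert an entry or update its load; older heap records become stale."""
        self._live[key] = (load, size)
        heapq.heappush(self._heap, (load, next(self._counter), key))

    def evict(self, bytes_needed):
        """Pop lowest-load entries until at least bytes_needed is freed."""
        freed, evicted = 0, []
        while freed < bytes_needed and self._heap:
            load, _, key = heapq.heappop(self._heap)
            current = self._live.get(key)
            if current is None or current[0] != load:
                continue  # stale record: entry was evicted or its load changed
            freed += current[1]
            evicted.append(key)
            del self._live[key]
        return evicted


# One queue per node (node ids here are made up for illustration).
queues = defaultdict(NodeEvictionQueue)
queues["gpu0"].touch("prefix-a", load=5, size=100)
queues["gpu0"].touch("prefix-b", load=1, size=100)
queues["gpu0"].touch("prefix-c", load=3, size=100)
queues["gpu0"].touch("prefix-b", load=10, size=100)  # prefix-b got hotter
print(queues["gpu0"].evict(150))  # ['prefix-c', 'prefix-a']
```

Updating a load pushes a fresh record instead of re-sorting the heap; stale records are skipped at eviction time, which keeps `touch` at O(log n). How `load` is actually computed (request rate, number of sharing requests, recency) is left open here.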