intelligent-machine-learning / glake

GLake: optimizing GPU memory management and IO transmission.
Apache License 2.0
351 stars 32 forks

Does vtensor need a 64K/128K physical memory policy? #24

Open nalinaly opened 1 month ago

nalinaly commented 1 month ago

vAttention states that with a 2MB page size, up to 128MB of physical memory can be wasted per request in the worst case for Llama-3-8B (TP-1), whereas with a 64KB page size the worst-case waste drops to only 4MB. Does vtensor have the same problem? Will vtensor support 64K/128K page sizes in the future?
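
For reference, the 128MB and 4MB figures can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch under my own assumptions, not vtensor's or vAttention's actual code: Llama-3-8B has 32 layers, each layer keeps one K buffer and one V buffer as separately mapped virtual tensors, and each buffer can leave at most one partially filled page per request. The function name `worst_case_waste_bytes` is hypothetical.

```python
KB = 1024
MB = 1024 * KB

NUM_LAYERS = 32        # Llama-3-8B
BUFFERS_PER_LAYER = 2  # assumption: one K buffer and one V buffer per layer


def worst_case_waste_bytes(page_size_bytes,
                           num_layers=NUM_LAYERS,
                           buffers_per_layer=BUFFERS_PER_LAYER):
    """Upper bound on per-request waste from partially filled pages.

    Assumes each buffer's last mapped page can be almost entirely unused,
    so the bound is one full page per buffer.
    """
    return num_layers * buffers_per_layer * page_size_bytes


for page_size in (2 * MB, 128 * KB, 64 * KB):
    waste_mb = worst_case_waste_bytes(page_size) / MB
    print(f"page size {page_size // KB:>5} KB: worst-case waste {waste_mb:g} MB")
```

Under these assumptions a 128KB page size would land in between, at 8MB per request.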