substratusai / runbooks

Finetune LLMs on K8s by using Runbooks
https://www.substratus.ai

notebooks should not have ephemeral storage limit #246

Open samos123 opened 12 months ago

samos123 commented 12 months ago
  Warning  Evicted              60s                kubelet            Pod ephemeral local storage usage exceeds the total limit of containers 100Gi.

This is frustrating because in many cases the node had more than 100Gi of ephemeral storage available, and I lose all my notebook progress when this happens.

samos123 commented 11 months ago

This turned out to be a K8s bug: if one of the sidecars sets a limit, then all containers inherit that limit.
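
To check whether a pod is in that situation, something like the following rough sketch could help. It takes a pod manifest as a plain dict and reports which containers set an `ephemeral-storage` limit and which do not. The function names, the container names `notebook` and `sidecar`, and the example manifest are all made up for illustration; nothing here is taken from the runbooks code, and it does not reproduce the kubelet's actual eviction logic.

```python
# Suffixes used by Kubernetes resource quantities (subset, enough for this sketch).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(q: str) -> int:
    """Convert a quantity such as '100Gi' to bytes (plain integers and the suffixes above only)."""
    for suffix, factor in _SUFFIXES.items():
        if q.endswith(suffix):
            return int(float(q[: -len(suffix)]) * factor)
    return int(q)


def audit_ephemeral_limits(pod: dict) -> dict:
    """Report which containers in a pod manifest set an ephemeral-storage limit."""
    limited = {}    # container name -> limit in bytes
    unlimited = []  # container names with no ephemeral-storage limit
    for c in pod.get("spec", {}).get("containers", []):
        limit = c.get("resources", {}).get("limits", {}).get("ephemeral-storage")
        if limit is None:
            unlimited.append(c["name"])
        else:
            limited[c["name"]] = parse_quantity(limit)
    return {
        "total_limit_bytes": sum(limited.values()),
        "limited": limited,
        "unlimited": unlimited,
        # True when only some containers set a limit, the case described above.
        "mixed": bool(limited) and bool(unlimited),
    }


# Hypothetical manifest: the notebook container sets no limit, the sidecar does.
example_pod = {
    "spec": {
        "containers": [
            {"name": "notebook", "resources": {"limits": {"cpu": "2"}}},
            {"name": "sidecar", "resources": {"limits": {"ephemeral-storage": "100Gi"}}},
        ]
    }
}

report = audit_ephemeral_limits(example_pod)
print(report)
```

If `mixed` is true, the notebook container has no limit of its own but the pod still carries one from the sidecar, which matches the eviction message quoted earlier in the thread.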