coreos / torus

Torus Distributed Storage
https://coreos.com/blog/torus-distributed-storage-by-coreos.html
Apache License 2.0

need soft limits for volume quota and storage node's data file #401

Open nak3 opened 7 years ago

nak3 commented 7 years ago

version

issue

Since there are no soft limits for the volume quota or torusd's data file, it is difficult to notice that space is running out before it is completely exhausted.

For example, I was running postgres on k8s with torus, and the postgres pods suddenly went down even though the torus pod was still running, as shown below:

$ kubectl get pod
NAME                               READY     STATUS    RESTARTS   AGE
etcd-torus                         1/1       Running   1          2d
postgres-torus-1616493773-njtjp    0/1       Error     0          1d
postgres-torus-1616493773-njtjp2   0/1       Error     0          1d
torus-snqwd                        1/1       Running   1          2d

After checking the torus log, I found that the storage node's disk space was full.

$ kubectl logs torus-snqwd
...
2016-11-19 19:38:01.432298 E | storage: mfile: out of space
2016-11-19 19:38:01.455505 E | storage: mfile: out of space
2016-11-19 19:38:01.575292 E | storage: mfile: out of space

$ kubectl get event
FIRSTSEEN   LASTSEEN   COUNT     NAME                               KIND      SUBOBJECT   TYPE      REASON        SOURCE               MESSAGE
1d          32s        5598      postgres-torus-1616493773-njtjp    Pod                   Warning   FailedMount   {kubelet fed-node}   Unable to mount volumes for pod "postgres-torus-1616493773-njtjp_default(73b274b2-ae3d-11e6-9746-5254007c7544)": attach command failed, status: Failure, reason: Couldn't attach
1d          32s        5598      postgres-torus-1616493773-njtjp    Pod                   Warning   FailedSync    {kubelet fed-node}   Error syncing pod, skipping: attach command failed, status: Failure, reason: Couldn't attach
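
To illustrate the kind of soft limit being requested, here is a minimal sketch in Go. It is not torus code: `CheckUsage`, `UsageLevel`, the 0.8 ratio, and the 1 GiB size are all hypothetical names and values chosen for the example. The idea is only that a warning is raised once usage crosses a configurable fraction of capacity, well before writes start failing with "out of space".

```go
package main

import "fmt"

// UsageLevel describes how close a store is to running out of space.
// Hypothetical type for illustration; not part of torus.
type UsageLevel int

const (
	UsageOK   UsageLevel = iota // below the soft limit
	UsageSoft                   // at or above the soft limit: warn, writes still succeed
	UsageFull                   // at the hard limit: further writes would fail
)

// CheckUsage classifies current usage against a soft limit expressed as a
// fraction of total capacity (for example 0.8 for 80%).
// Hypothetical helper for illustration; not part of torus.
func CheckUsage(usedBytes, totalBytes uint64, softRatio float64) UsageLevel {
	if totalBytes == 0 || usedBytes >= totalBytes {
		return UsageFull
	}
	if float64(usedBytes) >= softRatio*float64(totalBytes) {
		return UsageSoft
	}
	return UsageOK
}

func main() {
	const total uint64 = 1 << 30 // 1 GiB data file, illustrative value only

	// Three sample usage values: 512 MiB, 900 MiB, and the full 1 GiB.
	for _, used := range []uint64{512 << 20, 900 << 20, 1 << 30} {
		switch CheckUsage(used, total, 0.8) {
		case UsageSoft:
			fmt.Printf("warning: soft limit exceeded (%d of %d bytes used)\n", used, total)
		case UsageFull:
			fmt.Printf("error: out of space (%d of %d bytes used)\n", used, total)
		default:
			fmt.Printf("ok: %d of %d bytes used\n", used, total)
		}
	}
}
```

With a check like this, the storage node could log a warning (or expose a metric) at the soft limit, so the condition is visible before dependent pods such as postgres fail.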