k3s-io / kine

Run Kubernetes on MySQL, Postgres, sqlite, dqlite, not etcd.
Apache License 2.0
1.5k stars 228 forks source link

Requests with invalid revisions are fulfilled #215

Closed rmweir closed 4 months ago

rmweir commented 9 months ago

Requests with improper revisions are fulfilled by returning latest. This is observed when using a distro on kine and requests an arbitrary resource version, i.e. with a default rancher-desktop setup. This happens even where ResourceVersionMatch is set to Exact. The response is presumedly the latest version but the revision is set as the one in the request. This offers no feedback regarding the invalid request. The offending code according to @brandond is probably here https://github.com/k3s-io/kine/blob/master/pkg/logstructured/logstructured.go#L181-L182.

brandond commented 8 months ago

There are actually a couple errors here:

This may or may not be an issue depending on how Kubernetes makes its requests. I think Kubernetes only handles the compact error on lists anyway, so it's probably fine, if confusing.

If I insert a bunch of rows, such that all previous revisions are compacted, getting the old revision responds as if the key does not exist

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=3
{"header":{"revision":3}}
brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=4
{"header":{"revision":4},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":4,"value":"Ng=="}]}

For example, testing a key created as rev=3 and updated once as rev=4, the following is seen:

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=2
{"header":{"revision":2}}
brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=3
{"header":{"revision":3},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":3,"value":"NQ=="}]}
brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=4
{"header":{"revision":4},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":4,"value":"Ng=="}]}
brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=5
{"header":{"revision":5},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":4,"value":"Ng=="}]}
brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=9999
{"header":{"revision":9999},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":4,"value":"Ng=="}]}

The current max revision here is 4, anything newer that that should return an error.

brandond commented 8 months ago

Kine also appears to leave the count field empty when responding to a GET, but Kubernetes doesn't care because it is only expecting 1 record back, and just checks the length of the list.

kine:

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=0
{"header":{"revision":4},"kvs":[{"key":"L3g=","create_revision":3,"mod_revision":4,"value":"Ng=="}]}

etcd:

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=0
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":5,"raft_term":2},"kvs":[{"key":"L3g=","create_revision":2,"mod_revision":5,"version":4,"value":"NQ=="}],"count":1}

Kine also does the wrong thing when asking for a compact-but-latest revision. The object will still exist at that revision because it is the most recent revision, but it should be impossible to retrieve anything at that revision because it's been compacted. You have to ask for a revision that hasn't been compacted.

For example, when compacting etcd up to revision 1000:

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=0
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":1106,"raft_term":2},"kvs":[{"key":"L3g=","create_revision":2,"mod_revision":5,"version":4,"value":"NA=="}],"count":1}

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=5
{"level":"warn","ts":"2023-10-27T00:32:12.819Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-1f5ec387-9218-4caf-95c2-0495767445fb/172.17.0.2:2379","attempt":0,"error":"rpc error: code = OutOfRange desc = etcdserver: mvcc: required revision has been compacted"}
Error: etcdserver: mvcc: required revision has been compacted

brandond@dev01:~$ etcdctl --endpoints=http://172.17.0.2:2379 -w json get /x --rev=1000
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":1106,"raft_term":2},"kvs":[{"key":"L3g=","create_revision":2,"mod_revision":5,"version":4,"value":"NA=="}],"count":1}
brandond commented 8 months ago

tl;dr we should:

I think that would fix everything.

brandond commented 8 months ago

Validation in k3s with these changes applied. Note that revisions higher than the current revision shown in the header can no longer be retrieved with get. get and watch now behave consistently for compacted revisions.

/ # etcdctl --endpoints=unix:///var/lib/rancher/k3s/server/kine.sock -w json get /registry/health
{"header":{"revision":14678},"kvs":[{"key":"L3JlZ2lzdHJ5L2hlYWx0aA==","create_revision":2,"mod_revision":2,"value":"eyJoZWFsdGgiOiJ0cnVlIn0="}],"count":1}

/ # etcdctl --endpoints=unix:///var/lib/rancher/k3s/server/kine.sock -w json watch /registry/health --rev 2
watch was canceled (etcdserver: mvcc: required revision has been compacted)
{"Header":{"revision":14687},"Events":[],"CompactRevision":13604,"Canceled":true,"Created":false}
Error: watch is canceled by the server

/ # etcdctl --endpoints=unix:///var/lib/rancher/k3s/server/kine.sock -w json get /registry/health --rev 2
Error: etcdserver: mvcc: required revision has been compacted

/ # etcdctl --endpoints=unix:///var/lib/rancher/k3s/server/kine.sock -w json get /registry/health --rev 15000
Error: etcdserver: mvcc: required revision is a future revision