lbernail opened this issue 4 years ago (Open)
@kubernetes/sig-cli-feature-requests @soltysh - for thoughts. I don't have a strong preference about how much detail we want to expose, but at a very high level I'm supportive of this, given we have the ability to set RV in the k8s API.
@lbernail your example assumes you are getting all 30k pods at once, which kubectl get does not do: by default it returns elements in chunks of 500 (see https://github.com/kubernetes/kubernetes/blob/17312ea4a92a0bba31272a6709b37a88aa383b2d/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L473), the flag being:
--chunk-size=500: Return large lists in chunks rather than all at once. Pass 0 to disable. This flag is beta and
may change in the future.
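For illustration, a rough sketch of what chunking looks like at the API level (assuming an apiserver reachable at http://localhost:8080, e.g. via kubectl proxy --port=8080; the continue token value here is made up): each chunk is a separate LIST request, and the metadata.continue token from one response is passed back to fetch the next page.

# First chunk of at most 500 items; the response carries a metadata.continue token
curl "http://localhost:8080/api/v1/pods?limit=500"

# Next chunk, resuming where the previous request left off
curl "http://localhost:8080/api/v1/pods?limit=500&continue=ENCODED_CONTINUE_TOKEN"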
I don't have any strong opinions one way or the other, similarly to Wojtek :wink:, but I could see such an addition being useful with the --watch capability, which currently hard-codes 0 (see https://github.com/kubernetes/kubernetes/blob/17312ea4a92a0bba31272a6709b37a88aa383b2d/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L664). You'd need to add it here: https://github.com/kubernetes/kubernetes/blob/17312ea4a92a0bba31272a6709b37a88aa383b2d/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L441, similarly to how we request the full object when sorting is requested.
Thanks a lot @soltysh. Even if the calls are paginated, they all make it to etcd, right? So the total query time can probably only be higher (and the impact on etcd greater)? Can the apiserver serve RV=0 paginated queries from the cache? (I think I read somewhere that paginated calls bypass the apiserver cache.)
I'm discovering the includeObject option. How is it related to getting data from the apiserver cache? (sorry for the probably very basic question)
but I could see such an addition being useful with the --watch capability, which currently hard-codes 0,
I like this argument.
Can the apiserver serve RV=0 paginated queries from the cache?
The watch cache doesn't support pagination, so the limit param is ignored with RV=0.
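A rough way to see this (a sketch, assuming an apiserver reachable at http://localhost:8080, e.g. via kubectl proxy --port=8080, and the behavior described above where the watch cache ignores limit):

# RV unset: consistent read from etcd, limit is honored
curl "http://localhost:8080/api/v1/pods?limit=5"

# RV=0: served from the watch cache, which ignores limit, so the full list comes back
curl "http://localhost:8080/api/v1/pods?resourceVersion=0&limit=5"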
I'm discovering the includeObject option. How is it related to getting data from the apiserver cache? (sorry for the probably very basic question)
By default kubectl get retrieves server-side printed information about a resource. It's a table with pre-defined columns that is then printed to stdout. When you request -oyaml or anything else that requires more data, that table contains the full definition of the object.
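As a rough illustration of what that looks like on the wire (a sketch assuming an apiserver reachable at http://localhost:8080; the Accept header is the Table negotiation kubectl uses for server-side printing, and includeObject is the parameter mentioned above):

# Ask the apiserver for the server-side printed Table instead of full objects
curl -H 'Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json' "http://localhost:8080/api/v1/pods"

# Same Table, but with each row carrying the full object (what kubectl asks for when sorting)
curl -H 'Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json' "http://localhost:8080/api/v1/pods?includeObject=Object"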
/triage accepted /priority backlog /help /good-first-issue
@lbernail were you still interested in working on this?
Hi, may I give it a try?
@rudeigerc, yeah, go ahead; type /assign so k8s-ci-robot will assign you.
@lauchokyip Thanks a lot!
/assign
I think that I could follow https://github.com/kubernetes/kubectl/issues/965#issuecomment-718070378 to make some updates. My question is: should --resource-version only be supported when --watch is active, or should it be exposed for kubectl get as a whole? (Since when the resource version is specified, it is handled separately in func (o *GetOptions) watch(f cmdutil.Factory, cmd *cobra.Command, args []string) error, as mentioned above.)
My question is: should --resource-version only be supported when --watch is active, or should it be exposed for kubectl get as a whole?
It should be exposed to kubectl get in general. It basically enables stale reads at exact timestamps; we'd likely want this for both list and individual-object get operations.
@logicalhan Got it. Thanks.
Is there any update?
@howardshaw I guess you can go ahead and give it a try
@howardshaw Sorry I ran out of time recently, please go ahead if you would like to.
I guess I will try tackling this one
/assign
I tried adding req.Param("resourceVersion", "x") here (https://github.com/kubernetes/kubernetes/blob/17312ea4a92a0bba31272a6709b37a88aa383b2d/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L441), where x is any number > 0, to test. However, when I run kubectl get pods it always outputs The resourceVersion for the provided list is too old. If I do curl http://localhost:8080/api/v1/pods?resourceVersion="x" it is able to output the PodList. Is this behaviour expected? I am not sure how changing req.Param is different from using curl with resourceVersion as the query parameter.
ok i will try
@howardshaw, I am working on this one now, but I will let you know if I can't solve it. Thank you.
I did some digging; the error seems to be coming from etcd (https://github.com/kubernetes/autoscaler/blob/57be08dbdcd4cd92e43f89310f2114e365d02624/cluster-autoscaler/vendor/k8s.io/apiserver/pkg/storage/etcd3/errors.go#L28). This is where v3rpc.ErrCompacted (https://github.com/kubernetes/autoscaler/blob/57be08dbdcd4cd92e43f89310f2114e365d02624/cluster-autoscaler/vendor/go.etcd.io/etcd/clientv3/watch.go#L119) will be returned. I am assuming that when the watch channel is run (https://github.com/kubernetes/autoscaler/blob/57be08dbdcd4cd92e43f89310f2114e365d02624/cluster-autoscaler/vendor/k8s.io/apiserver/pkg/storage/etcd3/watcher.go#L154), this line (https://github.com/kubernetes/autoscaler/blob/57be08dbdcd4cd92e43f89310f2114e365d02624/cluster-autoscaler/vendor/k8s.io/apiserver/pkg/storage/etcd3/watcher.go#L242) is returning the error.
You have to use a recent RV; you can't start at 1, it'll almost never work. What I would do is make a list call against any arbitrary resource. Doing so returns a ListRV, which is equivalent to etcd's current internal global counter. You can then use that RV to make subsequent requests. That RV will work for X minutes, where X is equal to the compaction interval set on the kube-apiserver. Every X minutes, the apiserver will make a call to compact etcd's database, which truncates old RVs. Eventually every RV will be compacted, so no RV is usable permanently (unless you disable compaction, but then you will run into other issues).
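A rough sketch of that flow using kubectl get --raw (jq is only used here to pull out the field and is an assumption of this example):

# Any list call returns the current "list RV" in .metadata.resourceVersion
RV=$(kubectl get --raw "/api/v1/namespaces/default/pods" | jq -r '.metadata.resourceVersion')

# Reuse that RV for subsequent requests; it remains usable until compaction truncates it
kubectl get --raw "/api/v1/namespaces/default/pods?resourceVersion=${RV}"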
Will start working on it soon. Reading the Kubernetes API docs now :)
I synced up with @lauchokyip and have a few thoughts/questions.
Given this ResourceVersion Parameter table:
- Some sort of --cached flag (better name?) makes sense to me for kubectl get.
/remove-help /remove-good-first-issue
- Some sort of --cached flag (better name?) makes sense to me for kubectl get.
Personally I would find that pretty confusing. It is possible to issue an exact read at an exact RV and get back a strongly consistent read, if nothing has changed. Specifying a specific resourceVersion is tantamount to saying "I want this object at that revision", regardless of what has or hasn't happened in the storage layer.
I don't really have an opinion on whether you'd want to pass --resourceVersion=0 or --cached. I'm more interested in the usability of RVs > 0.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
/remove-lifecycle rotten
/reopen
@logicalhan: Reopened this issue.
Personally, I would like this feature.
@logicalhan, would it make sense to just implement this for kubectl get --watch, based on this comment (https://github.com/kubernetes/kubectl/issues/965#issuecomment-718070378)?
I discussed this with @eddiezane; we actually have some questions that are stopping us from moving forward.
1) For a normal user, there is no obvious way to get an RV from kubectl, so what value would the user have to put in to make it useful if the user wants to use kubectl get --watch <RV>?
2) If we implement the RV outside of kubectl get --watch, we would need an exact RV based on the RV Parameter Table, because kubectl will by default append the chunk size/limit of 500 to the query string:
lau@debian:~ $ kubectl get pods -v=6
I0816 23:11:24.760352 19097 loader.go:372] Config loaded from file: /home/lau/.kube/config
I0816 23:11:25.799065 19097 round_trippers.go:454] GET https://127.0.0.1:6443/api/v1/namespaces/default/pods?limit=500 200 OK in 998 milliseconds
NAME READY STATUS RESTARTS AGE
personal-website-deployment-798f5b7c8c-lkkgv 1/1 Running 0 46d
personal-website-deployment-798f5b7c8c-bfb95 1/1 Running 0 46d
personal-website-deployment-798f5b7c8c-bsrcx 1/1 Running 0 46d
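(For what it's worth, for a single object the RV is at least visible today; a small sketch, reusing one of the pod names above:

kubectl get pod personal-website-deployment-798f5b7c8c-lkkgv -o jsonpath='{.metadata.resourceVersion}'

but for a plain list there is nothing similarly obvious, which is what makes question 1 awkward.)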
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
/reopen
@heipepper: You can't reopen an issue/PR unless you authored it or you are a collaborator.
Is there any update? Can you reopen this issue? @logicalhan
We decided there is no helpful use case for this issue, so it remains closed for this reason.
I think this feature will be helpful when using kubectl get on large clusters. Today, the list action puts a huge load on etcd and the apiserver; etcd will return all data even if we use a labelSelector or fieldSelector. Maybe we just need a simple flag to make the action read from the cache by setting RV=0.
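A sketch of what such a flag would translate to at the API level (using kubectl get --raw; the label selector is just an example):

# resourceVersion=0 lets the apiserver answer from its watch cache instead of doing a quorum read from etcd
kubectl get --raw "/api/v1/pods?resourceVersion=0"

# Selectors can still be applied; with RV=0 the filtering happens against the apiserver cache rather than after a full read from etcd
kubectl get --raw "/api/v1/pods?resourceVersion=0&labelSelector=app%3Dweb"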
/reopen /remove-lifecycle rotten
@logicalhan: Reopened this issue.
This addition would be helpful if you are watching a very small set of known pods which you can GET by RV and then immediately start watching from that RV, since you don't have to do the entire list.
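A rough sketch of that flow with raw API calls (pod name and namespace are placeholders, and jq is assumed for extracting the field):

# GET the single pod and remember its resourceVersion
RV=$(kubectl get --raw "/api/v1/namespaces/default/pods/my-pod" | jq -r '.metadata.resourceVersion')

# Start watching just that pod from exactly that RV, skipping the initial full list
kubectl get --raw "/api/v1/namespaces/default/pods?watch=true&fieldSelector=metadata.name%3Dmy-pod&resourceVersion=${RV}"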
This addition would be helpful if you are watching a very small set of known pods which you can GET by RV and then immediately start watching from that RV, since you don't have to do the entire list.
I'll try it. Since I'm not familiar with this code base, it may take some time
I would suggest bringing this up during a SIG CLI meeting, or you may end up making modifications and not getting your PR accepted. Thanks.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
What would you like to be added:
Support for --resource-version=X for kubectl get (or maybe a simpler flag to just force a read from the cache by setting RV=0, such as --cache?).
Why is this needed:
Today, kubectl get only supports LISTing with RV="", which on large clusters can be slow for users and can impact the control plane, and etcd in particular.
I did a few tests on a large cluster with curl and the difference can be very significant:
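Roughly, the comparison is between a default list and a cache read, something like the following (a sketch assuming an apiserver reachable at http://localhost:8080, e.g. via kubectl proxy --port=8080; actual timings will vary by cluster):

# Default list: consistent quorum read served from etcd
time curl -s "http://localhost:8080/api/v1/pods" > /dev/null

# resourceVersion=0: served from the apiserver watch cache
time curl -s "http://localhost:8080/api/v1/pods?resourceVersion=0" > /dev/null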
Of course this is an extreme example:
I'd be more than happy to provide a PR, but I'm not familiar with the codebase, so any (even small) guidance would be appreciated. Here is what I think so far, but it's very likely I am missing/misunderstanding more than a few things:
cc @wojtek-t because we discussed this on slack earlier this week