Open lujiajing1126 opened 1 year ago
In our team we would love to see this option as well. We have played around with the newly introduced -l
option that is currently not released and only available on the main
branch (introduced via #70). However we noticed that the workload discovery seems to be done via the Kubernetes API also in this case.
Our setup contains a central monitoring cluster where we would like to utilize krr to produce recommendations for the different clusters connected to this central monitoring solution. A metric-based workload discovery would allow us not to require Kubernetes API access to every downstream cluster that is reporting metrics to the central solution. As shown above the workload information should be extractable via the stored metrics with reasonable effort.
In our team we would love to see this option as well. We have played around with the newly introduced
-l
option that is currently not released and only available on themain
branch (introduced via #70). However we noticed that the workload discovery seems to be done via the Kubernetes API also in this case.Our setup contains a central monitoring cluster where we would like to utilize krr to produce recommendations for the different clusters connected to this central monitoring solution. A metric-based workload discovery would allow us not to require Kubernetes API access to every downstream cluster that is reporting metrics to the central solution. As shown above the workload information should be extractable via the stored metrics with reasonable effort.
You can probably try my fork branch, https://github.com/lujiajing1126/krr/tree/support-prom-discovery.
We've already run this branch several times to provide resource recommendations for our production cluster. (maybe you need some tweak to the formatter)
Just
$ python krr.py simple -n <namespace> -p <prometheus-url>
is enough.
Hey @lujiajing1126, thank you for your work and pointing us to your fork. The results so far look verify promising. We were able to generate recommendations running against our central monitoring cluster without problems so far.
Are you planning to contribute your changes back to krr? If you need any input or further testing which we can help with feel free to ping me!
Hey @lujiajing1126, thank you for your work and pointing us to your fork. The results so far look verify promising. We were able to generate recommendations running against our central monitoring cluster without problems so far.
Are you planning to contribute your changes back to krr? If you need any input or further testing which we can help with feel free to ping me!
Thanks for your feedback! Sure, I am glad to contribute this feature back to the upstream. Since the branch contains several patches except the solution to this issue,
I would like to split it into several Pull Requests.
But so far I did not receive any code review or comments. It seems maintainers are not actively considering contributions from the community.
@lujiajing1126 hey, we appreciate the contribution and will review soon.
I'm sorry about the delay. We're a small team, so it takes sometimes takes more time than we like.
@lujiajing1126 Hey! I really like your idea and approach, and I am currently refactoring the way KRR is working with prometheus.
But there is one thing that came to my mind about removing the usage of cluster API: we have an auto-discovery feature, that makes life easy for lots of people not making them to search for the link. But we can leave that I think, it will just not work when you don't have access, making you to provide the uri.
But there is one more thing I am thinking about: Will it be possible to get data about CRDs using your approach, WDYT? (https://github.com/robusta-dev/krr/issues/65)
In our team we would love to see this option as well. We have played around with the newly introduced
-l
option that is currently not released and only available on themain
branch (introduced via #70). However we noticed that the workload discovery seems to be done via the Kubernetes API also in this case.Our setup contains a central monitoring cluster where we would like to utilize krr to produce recommendations for the different clusters connected to this central monitoring solution. A metric-based workload discovery would allow us not to require Kubernetes API access to every downstream cluster that is reporting metrics to the central solution. As shown above the workload information should be extractable via the stored metrics with reasonable effort.
@Avi-Robusta you are working with centralized monitoring, could you also join the discussion?
But there is one thing that came to my mind about removing the usage of cluster API: we have an auto-discovery feature, that makes life easy for lots of people not making them to search for the link. But we can leave that I think, it will just not work when you don't have access, making you to provide the uri.
Yes. I think so. We can generally provide two ways of workload discovery,
--prometheus-url
is given.But there is one more thing I am thinking about: Will it be possible to get data about CRDs using your approach, WDYT? (#65)
I am not familiar with ArgoCD. We need some investigation. But I think it is prevalent: advanced Kubernetes workload needs their own controller, e.g. OpenKruise
.
And I would like to contribute my branch back to the upstream. (probably I need some more work to polish the code. I guess I could finish next week)
Again, thanks to your team for the hardworking to bring us this great product!
Yes. I think so. We can generally provide two ways of workload discovery,
- Use Kubernetes API Server if provided, and automatically discover in-cluster Prom-compatible metrics server,
- Fallback to fully Prom-based discovery if only
--prometheus-url
is given.
We agree that making it an optional behaviour is a good choise. Maybe by adding --discovery-method
flag with "api-server"
by default or "prometheus"
as an option. With that approach we will not have to remove current implementation that works fine for mostly everyone
Hey all, just to update we have an updated PR with a new mode where KRR can operate only based on Prometheus data. So this will be supported in the near future. Please help by testing the PR!
Is your feature request related to a problem? Please describe.
In some cases, we want to run the KRR program locally. But for the security consideration, the API server of the Kubernetes cluster cannot be accessed outside of the cluster.
So we can use the Prometheus-based workload discovery if
kube-state-metrics
is installed.Describe the solution you'd like
We can do the workload based discovery with the following steps,
Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.
Additional context Add any other context or screenshots about the feature request here.