brancz closed this 6 years ago
+1
:+1: This will be very useful, thanks!
brancz, I've set this project up in Packet. Thank you!
Thanks! I've scheduled a call to talk to @brancz
@brancz is catching up with other tasks, so I am trying to get this going. I want to deploy the k8s cluster using https://github.com/crosscloudci/cross-cloud and try to add some CI tests, which I will then use to open a PR against https://github.com/crosscloudci/crosscloudci
I am krasi on #prometheus-dev; otherwise, let me know where I can ping someone to get access.
@taylorwaggoner could you please connect with him about providing these resources?
@krasi-georgiev - if you could please provide your email address, I will send you an invite to Packet. Thanks!
@krasi-georgiev - I sent you an invitation from Packet to the Prometheus 2.0 project. Thanks!
Got it, thanks. I will try the CNCF k8s deployment tomorrow.
@taylorwaggoner I made a first attempt to deploy the k8s cluster using https://github.com/crosscloudci/cross-cloud , and it is failing because it seems that we also need an account with https://dnsimple.com/ for name resolution.
-e PACKET_AUTH_TOKEN=secret
-e TF_VAR_packet_project_id=secret
-e DNSIMPLE_TOKEN=secret
-e DNSIMPLE_ACCOUNT=secret
Is this something that can also be provided by the CNCF?
I also opened an issue to see if this can be avoided, or whether another provider that offers a free account for FOSS could be used.
You can ignore this request, as it seems that they have moved away from DNSIMPLE and the README is out of date.
Any report for the experiment?
Due to some circumstances (CoreOS acquisition) the actual execution of this experiment has been postponed for a bit, but the majority of automation is done. When we're actually done, we will publish all results and announce it publicly. Sorry for the delay.
@brancz Thanks for quick reply. Looking forward to the final result.
@brancz Is there any update for this scalability experiment? Thanks a lot.
@pengjiang80 things got out of hand and we never ended up actually running this exact experiment, but within Red Hat we ran similar experiments and documented our findings in the Prometheus capacity planning document: https://docs.openshift.com/container-platform/3.11/scaling_performance/scaling_cluster_monitoring.html#cluster-monitoring-capacity-planning
@brancz Thanks for the information.
I've torn down the project associated with this request, as the task was completed.
First Name
Frederic
Last Name
Branczyk
Email
frederic.branczyk@coreos.com
Company/Organization
CoreOS // Prometheus team member
Job Title
Software Engineer
Project Title
Prometheus
Briefly describe the project
Prometheus is an open source monitoring solution and the 2nd project to join the CNCF. https://prometheus.io/
Which members of the CNCF community and/or end-users would benefit from your work?
Prometheus and Kubernetes users
Is the code that you’re going to run 100% open source? If so, what is the URL or URLs where it is located?
Yes 100% open source. Multiple projects under the Prometheus organization are intended to be deployed on a Kubernetes cluster across the nodes.
What kind of machines and how many do you expect to use (see: https://www.packet.net/bare-metal/)?
We would like to test with up to 1000 Kubernetes nodes (maybe the type 0 machines), plus a couple of high-memory nodes (type 2) to run Prometheus 2.0 on.
What OS and networking are you planning to use (see: https://help.packet.net/technical/infrastructure/supported-operating-systems)?
Container Linux by CoreOS
Please state your contributions to the open source community and any other relevant initiatives
Our primary goal of this experiment is to answer these questions:
More exact details of the experiment we are intending to execute can be found in this spreadsheet: https://docs.google.com/spreadsheets/d/1PazPI-ftZrONhmrXZtBdUuYqk4b4JodPpdL2E8tKt7o/edit?usp=sharing
How will this testing advance cloud native computing (specifically containerization, orchestration, microservices or some combination).
This will bring clarity about the resource requirements that Prometheus 2.0 has for a given number of collected samples per second and, more specifically, give recommendations for running on Kubernetes. Resource planning is a common pain point of operating Prometheus, and as of today there is no guidance on this for users. We are going to publish the results once the experiments are completed.
Any other relevant details we should know about while preparing the infrastructure?
In a meeting between the Prometheus team and Chris Aniszczyk from the CNCF, we were told that a Kubernetes cluster could be provisioned for us. We would obviously prefer that, since we would not have to take care of the setup ourselves. In that case, choose the OS you prefer if Container Linux is not an already automated option (though the Meltdown vulnerability is already fixed on all update channels :wink: ). We would be happy to have you provision Kubernetes on the nodes.
/cc @caniszczyk @gouthamve @brian-brazil @juliusv @superq @tomwilkie @fabxc