layer5io / meshery-smp-action

GitHub Action for pipelining microservices and Kubernetes performance testing with Meshery
https://layer5.io/projects/nighthawk
Apache License 2.0

Automating initialization of on-demand self-hosted CNCF CIL runner #39

Closed gyohuangxin closed 2 years ago

gyohuangxin commented 2 years ago

This commit automates the following steps:

  1. Create a CNCF CIL machine and register it as a self-hosted runner
  2. Run SMP benchmarks on the self-hosted runner
  3. Stop and remove the CNCF CIL machine and self-hosted runner
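
The three steps above could be sketched as a small shell driver. This is only an illustration under stated assumptions: `create_machine`, `run_benchmarks`, and `remove_machine` are hypothetical placeholders, not functions from this PR, and the actual workflow spreads these steps across GitHub Actions jobs. The sketch shows one property worth keeping: teardown still runs when the benchmark step fails, so no machine is left behind.

```shell
# Hypothetical placeholders: real versions would call the machine
# provisioning API and the runner registration scripts.
create_machine() { echo "create machine and register runner"; }
run_benchmarks() { echo "run SMP benchmarks"; }
remove_machine() { echo "remove machine and runner"; }

# Run the lifecycle. Once the machine exists, always tear it down,
# then report the benchmark's exit status to the caller.
run_lifecycle() {
  create_machine || return 1
  status=0
  run_benchmarks || status=$?
  remove_machine
  return "$status"
}
```

Calling `run_lifecycle` prints the three steps in order and returns a non-zero status if the benchmark step failed.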

Description

This PR fixes #38

Notes for Reviewers

Signed commits

hershd23 commented 2 years ago

Great work so far @gyohuangxin. You've made the workflow quite straightforward and easy to understand. If you need any help resolving the OS permission issues you mentioned in #38, do let me know.

Also mentioning @MarioArriaga92 who had a question in the recent build and release about provisioning and releasing instances during the workflow. This should help Mario!

gyohuangxin commented 2 years ago

@hershd23 Thanks, what you did before helped me a lot in this implementation! The issue blocking me now is how to run minikube start as the user I created. I tried to create a user and switch to it: https://github.com/gyohuangxin/meshery-smp-action/blob/self-hosted/.github/workflows/configurable-benchmark-test-self-hosted.yaml#L83-#L86 However, minikube reported that I was still the root user, which is not allowed to run minikube start: https://github.com/gyohuangxin/meshery-smp-action/runs/5357892309?check_suite_focus=true#step:3:76 It seems hard to switch to another user across GitHub Actions steps.
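
One way to surface this earlier is a guard that fails fast when a step is still running as root. A minimal sketch, assuming a POSIX shell on the runner; `require_non_root` is a hypothetical helper, not part of this PR. Each workflow step starts a fresh shell process, so a user switch made in one step does not carry over to the next, and the check would need to live in the same step as `minikube start`.

```shell
# Hypothetical helper: succeed only when the given UID (default: the
# current user's) is not root.
require_non_root() {
  uid="${1:-$(id -u)}"
  if [ "$uid" -eq 0 ]; then
    echo "running as root; switch to an unprivileged user first" >&2
    return 1
  fi
  return 0
}

# Possible use inside a single step (not executed here):
#   require_non_root && minikube start
```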

hershd23 commented 2 years ago

I tried looking for a solution to that myself but wasn't able to find anything. Is it possible to create a different user, say smp, at machine startup, so that our GitHub Actions sessions run as that user instead of root?

gyohuangxin commented 2 years ago

@hershd23 Yes, thank you for your help, that makes sense. I'm looking into creating the user via a userdata script, so that we can use the smp user from startup.

gyohuangxin commented 2 years ago

@hershd23 I used the cloud-config configuration to create the user smp and it works: https://github.com/gyohuangxin/meshery-smp-action/blob/self-hosted/.github/workflows/scripts/start-cil-runner.sh#L15 The cloud-config (userdata) is very helpful for configuring everything from machine startup, and here are the related docs:
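
For readers unfamiliar with cloud-config, creating a user at first boot could look roughly like the following. This is a sketch, not the contents of `start-cil-runner.sh`: `make_userdata` is a hypothetical helper, and the shell and passwordless sudo settings are illustrative assumptions. Only the user name `smp` comes from this thread.

```shell
# Hypothetical helper: print a cloud-config document that creates one
# user at first boot. The user name defaults to "smp" as in the thread;
# the shell and sudo values are assumptions for illustration.
make_userdata() {
  user="${1:-smp}"
  cat <<EOF
#cloud-config
users:
  - name: ${user}
    shell: /bin/bash
    sudo: "ALL=(ALL) NOPASSWD:ALL"
EOF
}
```

The output of `make_userdata` would then be passed as userdata when the machine is created, so the user exists before the runner is registered.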

You can find the latest result here: https://github.com/gyohuangxin/meshery-smp-action/runs/5401509632?check_suite_focus=true Machine creation, registration, and deletion have been implemented, but there is still an issue with running mesheryctl perf: it seems the pods cannot be accessed when using the kubernetes platform. Do you have any comments on this? @hershd23 @navendu-pottekkat

Configuring Meshery to access Minikube...
Error getting context: Post "http://localhost:9081/api/system/kubernetes/contexts": dial tcp [::1]:9081: connect: connection refused
Configuration file: load-test.yaml
Endpoint URL: http://192.168.49.2:31126/productpage
Service Mesh: ISTIO
Test Name: istio-fortio-load-test.yamltest
Load Generator: fortio
Running test with test configuration file load-test.yaml
Error: failed to make a request.Get "http://localhost:9081/api/user/performance/profiles?page_size=25&page=0&search=test": dial tcp [::1]:9081: connect: connection refused.
See https://docs.meshery.io/reference/mesheryctl/perf/apply for usage details

pottekkat commented 2 years ago

@gyohuangxin It seems like we are deploying Meshery inside the minikube cluster and there is some networking issue. I would suggest we deploy Meshery in Docker and connect it to the Kubernetes cluster. This is how we run tests on the GitHub runners now.

~I'm not sure how it deployed Meshery on Kubernetes right now. I will go over the action code and get back.~

We can set the platform on the workflow to Docker and it should work.

Does this help?
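
The `connection refused` errors on `localhost:9081` in the log above indicate that nothing was accepting connections on that port when `mesheryctl` ran. Whichever platform is used, a wait-until-ready loop before the perf step can help tell a slow start apart from a server that is not reachable at all. A minimal sketch, assuming a POSIX shell; `retry` is a hypothetical helper, and the `curl` probe in the comment is only an illustration.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds after each failure. Returns 0 on the first success
# and 1 when every attempt has failed.
retry() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Possible use before the perf step (not executed here):
#   retry 30 2 curl -s -o /dev/null http://localhost:9081
```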

pottekkat commented 2 years ago

Also, not deploying Meshery on the cluster has the added benefit of producing more accurate results, since running Meshery no longer interferes with the performance of the cluster.

gyohuangxin commented 2 years ago

@navendu-pottekkat Thanks, it helps. I'll try to deploy Meshery in Docker.

gyohuangxin commented 2 years ago

@navendu-pottekkat I tried to deploy Meshery in docker, but there was a nil pointer panic in meshery container. https://github.com/gyohuangxin/meshery-smp-action/runs/5462624684?check_suite_focus=true#step:6:1240 Do you have any comments on this?

pottekkat commented 2 years ago

> @navendu-pottekkat I tried to deploy Meshery in docker, but there was a nil pointer panic in meshery container. gyohuangxin/meshery-smp-action/runs/5462624684?check_suite_focus=true#step:6:1240 Do you have any comments on this?

@piyushsingariya I think we did fix this bug. A new release of mesheryctl should fix this right?

piyushsingariya commented 2 years ago

> > @navendu-pottekkat I tried to deploy Meshery in docker, but there was a nil pointer panic in meshery container. gyohuangxin/meshery-smp-action/runs/5462624684?check_suite_focus=true#step:6:1240 Do you have any comments on this?
>
> @piyushsingariya I think we did fix this bug. A new release of mesheryctl should fix this right?

@navendu-pottekkat This isn't a mesheryctl issue; it's from the server. @gyohuangxin can you try running the same performance test with a local build?

gyohuangxin commented 2 years ago

@navendu-pottekkat @piyushsingariya When the platform is docker and mesheryctl perf is used, it seems that the meshery container cannot access the endpoint of the service mesh application. https://github.com/gyohuangxin/meshery-smp-action/runs/5462624684?check_suite_focus=true#step:6:1174 Is there a way to make the endpoint accessible from the meshery container?

pottekkat commented 2 years ago

There is no Meshery-specific method for that. It could be a networking issue, but we were able to use the same configuration on GitHub runners to successfully run benchmark tests and access the application endpoint, so it could be environment specific.

gyohuangxin commented 2 years ago

@navendu-pottekkat Yes, I looked at the code and found that the panic is caused by failing to get the minikube context:

meshery-meshery-1 | time="2022-03-10T07:48:08Z" level=warning msg="failed to generate in cluster context: "
meshery-meshery-1 | time="2022-03-10T07:48:08Z" level=warning msg="failed to find kubernetes context"

I also tried the local build manually and it works, so it could be the OS permission issue again.

Regarding the GitHub runners, I found there was another panic there, https://github.com/layer5io/meshery-smp-action/runs/5493977248?check_suite_focus=true#step:6:1573, which may be another issue we need to fix.
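
If the permission hypothesis above is worth ruling out, a pre-flight check could confirm that the kubeconfig is readable by the user the job runs as before Meshery is started. This is a sketch under assumptions: `check_kubeconfig` is a hypothetical helper, it treats `KUBECONFIG` as a single path rather than a colon-separated list, and an unreadable kubeconfig is only one possible cause of the warnings shown.

```shell
# Hypothetical pre-flight check: the kubeconfig must exist and be
# readable by the current user. Path resolution order: first argument,
# then $KUBECONFIG (assumed to hold a single path), then the default.
check_kubeconfig() {
  cfg="${1:-${KUBECONFIG:-$HOME/.kube/config}}"
  if [ ! -r "$cfg" ]; then
    echo "kubeconfig not readable by $(id -un): $cfg" >&2
    return 1
  fi
  echo "kubeconfig readable: $cfg"
}
```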

gyohuangxin commented 2 years ago

Hi there. I've been blocked on this for too long, and the next important job (running scheduled benchmarking tests on the CNCF cluster) shouldn't be blocked by it. Can we start reviewing this PR and create another issue to track the meshery problem? @hershd23 @navendu-pottekkat @leecalcote

pottekkat commented 2 years ago

Yes @gyohuangxin We can review the workflow and merge it. Could you open a new issue to track the others?

pottekkat commented 2 years ago

@hershd23 Could you also review this PR?

hershd23 commented 2 years ago

Yes will do

gyohuangxin commented 2 years ago

@navendu-pottekkat @hershd23 Thanks, I opened the issue https://github.com/layer5io/meshery-smp-action/issues/40.

hershd23 commented 2 years ago

@navendu-pottekkat @leecalcote I don't seem to have the permissions to merge these changes. Do review them and merge them once you're done with your review

hershd23 commented 2 years ago

@leecalcote @navendu-pottekkat please do review and merge. Looks like I do not have merging permissions here

leecalcote commented 2 years ago

This is really neat.

leecalcote commented 2 years ago

New release available: v0.2.0 - https://github.com/marketplace/actions/performance-testing-with-meshery