Closed shashank-boyapally closed 1 month ago
Hi @paigerube14 @jtaleric just rebased with latest changes.
I added pyshorteners to shorten the buildUrl in the display. The requirements.txt file is not updated yet, since I want to update it along with fmatch 0.0.6 when it is released. There will be some function changes in fmatch 0.0.6.
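For anyone reviewing the buildUrl shortening, here is a minimal sketch of how it could be done defensively. It assumes the TinyURL backend of pyshorteners (`Shortener().tinyurl.short`); the helper name `shorten_build_url` and the fallback behaviour are illustrative and not taken from this PR.

```python
def shorten_build_url(url, shortener=None):
    """Return a shortened URL, or the original URL if shortening is not possible.

    `shortener` is any callable mapping a URL to a short URL. When omitted,
    the TinyURL backend of pyshorteners is used (an assumption, the PR may
    use a different backend).
    """
    if shortener is None:
        try:
            import pyshorteners  # third-party, imported lazily
        except ImportError:
            return url
        shortener = pyshorteners.Shortener().tinyurl.short
    try:
        return shortener(url)
    except Exception:
        # Network errors or rate limits should not break the display.
        return url
```

Falling back to the long URL keeps the table usable when the shortening service is unreachable, which can easily happen inside a CI environment.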
@shashank-boyapally can you let us know when this is ready for folks to test?
Hi @jtaleric, the feature is ready to be tested. I updated the requirements with the latest versions of fmatch and pyshorteners. Please test it and let me know any feedback.
ack - I just saw conflicts...
It appears that we expect runs = match.get_uuid_by_metadata(metadata) to return a dict, but it returns a list?
2024-03-29 08:46:45,421 - Orion - DEBUG - file: utils.py - line: 263 - ['ff4e1c2c-6960-4081-bc01-5df2c1e72541', '0e190b62-38ac-4d4d-98f7-baf1032c21f8', '91a7a520-ca19-43b9-9d5f-dca8f3df5518', '29dcba68-af3f-4e71-b6a7-2f0ecfae3977', '8f2f5271-5dff-44d4-b126-1f856e6a8387', 'f4fcc9eb-c6ce-4ac3-9979-63c0c5466302', 'b6af9829-ed2d-4773-bef6-26d106e400a5', 'e88b185a-34eb-4447-be11-54829bec9d39', '7873914e-24c1-4657-8317-0f778df1ec14', 'c525cda2-8712-4ad4-93f1-ec4a5169ce0d', 'f50c2163-7f72-4521-8a79-dbdfc5b1182d', '5ab533f6-0712-460a-9a0c-f132c1909e4c', 'f8a237da-9a19-4421-8a68-b6d10dc85f3a', 'f5ae4cde-e1bc-4e1d-b2da-0f358375988d', '9c6b4e2c-950c-46f4-8631-5123f127a082', 'a6e4a657-73c0-4228-ab17-1e1c177f97b8', '17f93785-c04a-4238-8029-ae5d1e3766d5', 'ea77b881-2d0e-4145-ae9d-ae853c08a07c', '668be9c2-7e6c-4134-8519-086085f54e04', '20d0c41a-be34-49b4-b5e0-e127bc49aa0d', 'ade0d220-3cb1-4902-8c5a-456ccb3f09f8', '0fd65106-dd94-4dd7-9ee3-ac062db2909e', '4d63bb03-b137-44be-bbc5-6690b54d7228', 'aaa28324-67af-45c8-b507-73fbb3063091']
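A mismatch like this usually comes from two fmatch versions being installed side by side. As a rough sketch, the caller could normalize whatever comes back before using it. The helper `normalize_runs` is hypothetical, and both the dict shape (uuid mapped to buildUrl) and the record keys are assumptions rather than the confirmed fmatch return type.

```python
def normalize_runs(runs):
    """Return a list of {"uuid": ..., "buildUrl": ...} records.

    Accepts a list of UUID strings (as in the debug log above), a list of
    dicts carrying "uuid" and optionally "buildUrl", or a dict mapping
    uuid -> buildUrl. All three shapes are assumptions for illustration.
    """
    if isinstance(runs, dict):
        return [{"uuid": uuid, "buildUrl": url} for uuid, url in runs.items()]
    normalized = []
    for run in runs:
        if isinstance(run, str):
            normalized.append({"uuid": run, "buildUrl": None})
        else:
            normalized.append({"uuid": run["uuid"], "buildUrl": run.get("buildUrl")})
    return normalized


records = normalize_runs(["ff4e1c2c-6960-4081-bc01-5df2c1e72541"])
print(records[0]["uuid"])
```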
I cleaned up my venv and now I get output... however.. New issue.
orion cmd --config config.yaml --hunter-analyze
2024-03-29 08:55:37,186 - Orion - INFO - file: orion.py - line: 54 - 🏹 Starting Orion in command-line mode
2024-03-29 08:55:37,188 - Orion - INFO - file: utils.py - line: 261 - The test aws-416-med-scale-cluster-density-v2 has started
2024-03-29 08:55:37,189 - Matcher - INFO - Executing query against index=perf_scale_ci
2024-03-29 08:55:37,305 - Matcher - INFO - Executing query against index=perf_scale_ci
2024-03-29 08:55:37,352 - Matcher - INFO - Executing query against index=ripsaw-kube-burner*
2024-03-29 08:55:37,398 - Orion - INFO - file: utils.py - line: 123 - Collecting podReadyLatency
2024-03-29 08:55:37,398 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:55:37,433 - Orion - INFO - file: utils.py - line: 123 - Collecting apiserverCPU
2024-03-29 08:55:37,434 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:55:38,170 - Orion - INFO - file: utils.py - line: 123 - Collecting ovnCPU
2024-03-29 08:55:38,170 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:55:39,563 - Orion - INFO - file: utils.py - line: 123 - Collecting etcdCPU
2024-03-29 08:55:39,563 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:55:40,270 - Orion - INFO - file: utils.py - line: 123 - Collecting etcdDisck
2024-03-29 08:55:40,270 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
and with the main branch
orion --config config.yaml --hunter-analyze
2024-03-29 08:57:04,940 - Orion - INFO - The test aws-416-med-scale-cluster-density-v2 has started
2024-03-29 08:57:04,940 - Matcher - INFO - Executing query against index=perf_scale_ci
2024-03-29 08:57:05,080 - Matcher - INFO - Executing query against index=ripsaw-kube-burner*
2024-03-29 08:57:05,118 - Orion - INFO - Collecting podReadyLatency
2024-03-29 08:57:05,119 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:57:05,158 - Orion - INFO - Collecting apiserverCPU
2024-03-29 08:57:05,159 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:57:05,840 - Orion - INFO - Collecting ovnCPU
2024-03-29 08:57:05,841 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:57:07,214 - Orion - INFO - Collecting etcdCPU
2024-03-29 08:57:07,214 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
2024-03-29 08:57:07,953 - Orion - INFO - Collecting etcdDisck
2024-03-29 08:57:07,953 - Matcher - INFO - Executing query against index=ripsaw-kube-burner
time                       uuid                                    P99    apiserverCPU_cpu_avg    ovnCPU_cpu_avg    etcdCPU_cpu_avg    etcdDisck_duration_avg
-------------------------  ------------------------------------  -----  ----------------------  ----------------  -----------------  ------------------------
2024-01-10 14:18:49 +0000  91a7a520-ca19-43b9-9d5f-dca8f3df5518  13000                 28.8317           8.03138            15.8919                 0.0131896
                                                                        ······················                    ·················
                                                                                         +8.2%                               +10.5%
                                                                        ······················                    ·················
2024-02-06 11:05:29 +0000  ff4e1c2c-6960-4081-bc01-5df2c1e72541  13000                 31.0332           8.22205            17.8231                  0.013406
2024-02-06 12:45:08 +0000  20d0c41a-be34-49b4-b5e0-e127bc49aa0d  13000                 31.7655           7.09412            17.3852                 0.0147301
2024-02-06 23:39:44 +0000  e88b185a-34eb-4447-be11-54829bec9d39  13000                  30.832           7.94952            17.5129                 0.0142578
2024-02-07 20:23:43 +0000  7873914e-24c1-4657-8317-0f778df1ec14  13000                 31.6765           6.91436             17.641                 0.0139926
2024-02-08 12:19:27 +0000  c525cda2-8712-4ad4-93f1-ec4a5169ce0d  13000                 31.3236           7.01006            17.8414                 0.0136317
2024-02-09 14:42:53 +0000  f4fcc9eb-c6ce-4ac3-9979-63c0c5466302  13000                 30.7888           6.96967            17.1353                 0.0135467
2024-02-09 14:48:52 +0000  8f2f5271-5dff-44d4-b126-1f856e6a8387  13000                 30.7914           8.18303            17.3592                 0.0132814
2024-02-12 01:22:23 +0000  0fd65106-dd94-4dd7-9ee3-ac062db2909e  13000                   31.42           7.01348            17.8418                 0.0135024
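For reference, the +8.2% and +10.5% in the main-branch output are consistent with the relative change in the mean of apiserverCPU_cpu_avg and etcdCPU_cpu_avg between the run before the change point and the runs after it. The sketch below only reproduces those two figures from the rows shown; it assumes the rows above are the complete segments and is not a description of Hunter's internals.

```python
from statistics import mean


def percent_change(before, after):
    """Relative change of the segment mean, in percent."""
    return (mean(after) - mean(before)) / mean(before) * 100


# Values copied from the table above (one run before the change point, eight after).
apiserver_before = [28.8317]
apiserver_after = [31.0332, 31.7655, 30.832, 31.6765, 31.3236, 30.7888, 30.7914, 31.42]
etcd_before = [15.8919]
etcd_after = [17.8231, 17.3852, 17.5129, 17.641, 17.8414, 17.1353, 17.3592, 17.8418]

apiserver_change = f"{percent_change(apiserver_before, apiserver_after):+.1f}%"
etcd_change = f"{percent_change(etcd_before, etcd_after):+.1f}%"
print(apiserver_change)  # +8.2%
print(etcd_change)       # +10.5%
```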
I chatted with @shashank-boyapally on Slack. Here are the current thoughts:
- The initial version of Orion with daemon mode will be opinionated. The user will only provide the version they want to check for a regression. We will use the openshift-payload job to run Hunter against. The payload jobs have the most data and seem like a good starting point.
- A follow-on version of Orion with daemon mode could consider implementing a way to accept multiple configs, so the user can choose which config to check for detected changes. One idea is to have a listTests API endpoint to determine which tests are loaded; the user could then provide a test name to run Hunter/Algo against, like name=aws-cdv2-fips-120node or whatever the tests are defined as in the configuration file.
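To make the listTests idea a bit more concrete, here is a rough sketch of the lookup logic such an endpoint could sit on top of. It assumes the loaded configuration is a dict with a `tests` list whose entries have a `name` key; the function names `list_tests` and `select_test` and the sample config are hypothetical, and the HTTP layer is left out on purpose.

```python
# Hypothetical loaded configuration; the real config layout may differ.
CONFIG = {
    "tests": [
        {"name": "aws-cdv2-fips-120node", "metrics": ["podReadyLatency"]},
        {"name": "small-scale-cluster-density", "metrics": ["apiserverCPU"]},
    ]
}


def list_tests(config):
    """Names of all tests defined in the loaded configuration."""
    return [test["name"] for test in config.get("tests", [])]


def select_test(config, name):
    """Return the test definition matching `name`, or raise KeyError."""
    for test in config.get("tests", []):
        if test["name"] == name:
            return test
    raise KeyError(f"unknown test {name!r}; available: {list_tests(config)}")


print(list_tests(CONFIG))
print(select_test(CONFIG, "aws-cdv2-fips-120node")["metrics"])
```

Including the available names in the error message lets a caller recover from a typo without a second request.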
+1 on these ideas. To add on top of it, just so that we don't lose track: Hunter also has a feature where tests are divided into groups and then compared against each other for regressions. That would also be a good use case to add later when we get to that point.
@jtaleric @shashank-boyapally these changes sound good to me. One question: if we have a job from, say, January that shows a regression, should we continue to report that issue with every run, or should we set a time period (last 2 weeks) or a number of runs back (last 10 runs) to limit re-reporting regressions?
Hunter also has a parameter where we can specify the timestamp field to look at data regressions since a starting point. Ahh, I forgot to mention earlier, it would be a good addition too.
Hi Paige, my opinion on this is that previous regressions should still show in both cmd mode and daemon mode; the service consuming the API should be able to filter them out based on the timestamp if needed. My reasoning for keeping the full set of regressions is the timeline: each version of OpenShift is tested for about 6 months. One thing we can do is use Hunter's timestamp filter, which can ignore runs before a given point, if we want that functionality.
I'm thinking in terms of orion running in a CI where we might only want to know if the latest/current run is showing a regression from previous runs and fail the job if a regression is detected. I think the timestamping from hunter would be helpful with that as both @vishnuchalla and @shashank-boyapally mentioned. Still a ways out from getting this into a CI though. Just a thought. I'll open an issue to track this option
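As a sketch of the consumer-side filtering discussed above, the snippet below keeps only change points that fall inside a time window and/or within the last N runs, and checks whether the latest run is itself flagged. The record shape (dicts with `uuid` and `time`) and the function names are assumptions for illustration; this is separate from Hunter's own timestamp option.

```python
from datetime import datetime, timedelta, timezone


def recent_regressions(changepoints, runs, now, max_age=None, last_n=None):
    """Filter change points by age and/or by position in the run history.

    changepoints: list of {"uuid": str, "time": datetime}
    runs: run UUIDs ordered oldest to newest
    """
    allowed = set(runs[-last_n:]) if last_n else set(runs)
    kept = []
    for cp in changepoints:
        if max_age is not None and now - cp["time"] > max_age:
            continue  # too old to re-report
        if cp["uuid"] not in allowed:
            continue  # outside the last N runs
        kept.append(cp)
    return kept


def latest_run_regressed(changepoints, runs):
    """True if the newest run is itself flagged as a change point."""
    return bool(runs) and any(cp["uuid"] == runs[-1] for cp in changepoints)


now = datetime(2024, 3, 29, tzinfo=timezone.utc)
runs = ["run-a", "run-b", "run-c"]
changepoints = [
    {"uuid": "run-a", "time": datetime(2024, 1, 10, tzinfo=timezone.utc)},
    {"uuid": "run-c", "time": datetime(2024, 3, 25, tzinfo=timezone.utc)},
]
print(recent_regressions(changepoints, runs, now, max_age=timedelta(weeks=2)))
```

A CI job could then fail only when `latest_run_regressed` is true, while a dashboard could still show the full history.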
I added verify certs as an argument so that it can support the new ES instances. The PR can be merged once we have fmatch 0.0.7.
As per our discussion offline, please break down this PR into atomic commits. Thanks
Type of change
Description
Do not MERGE until fmatch 0.0.6 is released
Firstly, please forgive me for changing many things in a single PR. Below are the updates included in this PR.
orion daemon. This is the opinionated version of daemon mode, where the tests are pre-determined, with small-scale-cluster-density being the default. Get list of options of tests available using
cmd-mode: run it using orion cmd
daemon-mode.
Related Tickets & Documents
Checklist before requesting a review
Testing