Closed: bparees closed this issue 7 years ago
Also, I am in favor of splitting those two tests into different jobs. I am more worried about how long our tests take than about the actual number of jobs we are running. As an example, K8S is running at least 8 presubmits today.
cc: @stevekuznetsov
> Also, I am in favor of splitting those two tests into different jobs. I am more worried about how long our tests take than about the actual number of jobs we are running.
Well, again, that test normally runs in 8 minutes; it's not taking a huge amount of time.
> Well, again, that test normally runs in 8 minutes; it's not taking a huge amount of time.
That doesn't justify shoehorning it into another job. We would get a clearer signal if this were a separate job, as opposed to waiting 2+ hours. Even the timeout option is not granular enough in this case. Also, short-running jobs > long-running jobs.
Right now the justification is that AWS EC2 charges by the hour, so 8 min == 1 h. Our costs would more than double if we split things out the way I would like to. As we move forward with @csrwng's Pod-based jobs, I am confident we can break things up into very bite-sized pieces, to the point of each verify step being its own job, etc. We're on GCE there, which is billed per minute, so our quantization error is much smaller.
The integration job today runs 1½ hours on clean runs (I hate that we don't have a graph with all these metrics), which means that if we could break the integration test down into jobs of less than an hour each, there would be no billing difference from today.
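(To make the billing argument above concrete, here is a minimal sketch of the arithmetic, assuming EC2 rounds each job up to a whole instance-hour and GCE bills per minute, as described in the comments above. The rate constant and the split durations are placeholders, not real figures.)

```python
import math

RATE_PER_HOUR = 1.0  # placeholder hourly instance rate; the exact price doesn't matter for the comparison


def ec2_cost(minutes):
    # EC2 (at the time of this thread) billed each started instance-hour in full
    return math.ceil(minutes / 60) * RATE_PER_HOUR


def gce_cost(minutes):
    # GCE bills per minute, so a job costs roughly its actual runtime
    return (minutes / 60) * RATE_PER_HOUR


# An 8-minute test billed on EC2 still costs a full hour,
# which is why splitting it into its own job more than doubles its cost:
print(ec2_cost(8))                  # 1.0 billed hour

# A ~90-minute integration job is billed as 2 hours either way,
# so splitting it into sub-hour pieces is billing-neutral on EC2:
print(ec2_cost(90))                 # 2.0 billed hours as one job
print(ec2_cost(45) + ec2_cost(45))  # 2.0 billed hours as two sub-hour jobs

# On GCE the quantization error mostly disappears:
print(gce_cost(90), gce_cost(45) + gce_cost(45))  # 1.5 1.5
```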
Yep, we could spend time doing that. We have just been conservative about it in the past for cost reasons, and we did not take the time to switch them out afterward due to priorities. If you want to make the switch in aos-cd-jobs, that sounds fine to me. The new job won't be 8 minutes, though, as we will need to rebuild a release, but it should be <1h.
It's more useful to make install_update faster than it is to split out these jobs, because install_update is the slowest job in the queue.
Agreed about making install_update faster. Opened https://github.com/openshift/aos-cd-jobs/issues/409, https://github.com/openshift/aos-cd-jobs/issues/408, and https://github.com/openshift/aos-cd-jobs/issues/407
Integration is slow because we have terrible code running in it. Spawning a separate issue.
David split all this.
As seen here, the integration tests hung and were timed out after 2 hours (this is why we introduced the timeouts):
https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_integration/4213
test-end-to-end-docker.sh took 8 minutes on a clean run, so the problem seems to be there.