Open haojin2 opened 6 years ago
Suggested labels: CI, Bug
@lebeg @kellensunderland @larroy
Looks like our Jenkins master is having some issues. Working on a quick fix.
I believe we've addressed the root cause of the issue (a volume low on space). I'm monitoring builds to make sure they're working properly. You may see rebuilds being retriggered for the next hour or so. I'll post back once I think the issue is resolved.
After monitoring for a few hours I think the CI is back to normal. Please feel free to ping if you see any other issues.
Seems like it has been normal for a while, closing this.
Hello @KellenSunderland, this issue has come up again on my PR. Can you take a look?
Re-opening the issue for tracking purposes. @marcoabreu @lebeg @larroy Can you guys please take a look at this issue?
We had a look. This will require changes to the GitHub plugin in Jenkins. @jlcontreras
I believe this is due to the GitHub plugin sometimes not being able to resolve the repo's URL, as we get this in the logs:
[Set GitHub commit status (universal)] SUCCESS on repos [] (sha:90a0bd0) with context:ci/jenkins/mxnet-validation/windows-gpu
instead of the usual
[Set GitHub commit status (universal)] SUCCESS on repos [GHRepository@4c52e2d2[description=Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more,homepage=https://mxnet.apache.org,name=incubator-mxnet,fork=false,size=50318,milestones={},language=C++,commits={},source=<null>,parent=<null>,responseHeaderFields={null=[HTTP/1.1 200 OK], Access-Control-Allow-Origin=[*], Access-Control-Expose-Headers=[ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type], Cache-Control=[private, max-age=60, s-maxage=60], Content-Encoding=[gzip], Content-Security-Policy=[default-src 'none'], Content-Type=[application/json; charset=utf-8], Date=[Tue, 11 Dec 2018 03:32:57 GMT], ETag=[W/"764bedf31b7dffddb84955393efdd67a"], Last-Modified=[Tue, 11 Dec 2018 03:21:56 GMT], OkHttp-Received-Millis=[1544499177176], OkHttp-Response-Source=[CACHE 200], OkHttp-Selected-Protocol=[http/1.1], OkHttp-Sent-Millis=[1544499176983], Referrer-Policy=[origin-when-cross-origin, strict-origin-when-cross-origin], Server=[GitHub.com], Status=[200 OK], Strict-Transport-Security=[max-age=31536000; includeSubdomains; preload], Transfer-Encoding=[chunked], Vary=[Accept, Authorization, Cookie, X-GitHub-OTP, Accept-Encoding], X-Accepted-OAuth-Scopes=[repo], X-Content-Type-Options=[nosniff], X-Frame-Options=[deny], X-GitHub-Media-Type=[github.v3; format=json], X-GitHub-Request-Id=[A5E6:0A07:CD38A9:10E7E9B:5C0F2FE8], X-OAuth-Scopes=[repo:status], X-RateLimit-Limit=[5000], X-RateLimit-Remaining=[4667], X-RateLimit-Reset=[1544499459], X-XSS-Protection=[1; mode=block]},url=https://api.github.com/repos/apache/incubator-mxnet,id=34864402]] (sha:229b8fb) with context:ci/jenkins/mxnet-validation/unix-gpu
Setting commit status on GitHub for https://github.com/apache/incubator-mxnet/commit/229b8fb732fdbdc96f2dec27815b332d6b856dda
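In case it helps to find which builds were affected, here is a rough sketch that scans a console log for status lines with an empty repo list. The regex is inferred only from the two sample lines above, and `find_unresolved_statuses` is a hypothetical helper name, not part of Jenkins or the plugin.

```python
import re

# Pattern inferred from the two log samples quoted in this thread only;
# other plugin versions may format the line differently (assumption).
STATUS_LINE = re.compile(
    r"\[Set GitHub commit status \(universal\)\] (?P<state>\w+) on repos "
    r"\[(?P<repos>.*)\] \(sha:(?P<sha>[0-9a-f]+)\) with context:(?P<context>\S+)"
)


def find_unresolved_statuses(log_text):
    """Return (sha, context, state) for status lines whose repo list is empty."""
    hits = []
    for line in log_text.splitlines():
        match = STATUS_LINE.search(line)
        # An empty "repos []" means no repository was resolved for this status.
        if match and match.group("repos").strip() == "":
            hits.append(
                (match.group("sha"), match.group("context"), match.group("state"))
            )
    return hits
```

Feeding it the text of a build's console log should list the short SHA and context of every status that was "set" against no repository.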
I'm currently testing this modification to the plugin, which adds a retry mechanism. I will post updates.
https://github.com/jenkinsci/github-plugin/compare/stable-1.29.x...jlcontreras:stable-1.29.x
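For readers who don't want to dig through the diff, the general idea of a retry around repo resolution can be sketched as below. This is not the code from the branch above, and it is in Python rather than the plugin's Java; `resolve_repos` and `set_status` are hypothetical callables standing in for the plugin internals, and the attempt count and delay are made-up defaults.

```python
import time


def set_status_with_retry(resolve_repos, set_status, attempts=3,
                          delay_seconds=2.0, sleep=time.sleep):
    """Retry repo resolution until it yields at least one repo, then set the status.

    resolve_repos and set_status are hypothetical stand-ins, not real
    github-plugin APIs. Returns True if a status was set, False otherwise.
    """
    for attempt in range(1, attempts + 1):
        repos = resolve_repos()
        if repos:
            for repo in repos:
                set_status(repo)
            return True
        if attempt < attempts:
            # Linear backoff: wait a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
    return False
```

Returning `False` instead of silently reporting success lets the caller fail loudly when the repo list stays empty, which is the case that currently shows up as `SUCCESS on repos []`.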
@jlcontreras Were you able to push a fix to the jenkinsci repo for this?
Please feel free to close the issue if this is fixed. I believe the right place to track this issue would be the jenkinsci repository rather than the MXNet repository. WDYT?
Makes sense to move the discussion there, we are still unsure about how to fix it.
@marcoabreu Can you close this issue if this does not happen anymore? Or, if the issue has been moved to the jenkinsci repository, can you close this and backlink this issue there?
It still happens all the time. Please consider using a more reliable DNS.
For #11631, the build is passing: http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11631/8/pipeline, but the status of the PR is not updated.