jeffwidman / bitbucket-issue-migration

A small script for migrating repo issues from Bitbucket to GitHub
GNU General Public License v3.0

Intermittent failure in check Github issue import #65

Closed jaraco closed 7 years ago

jaraco commented 8 years ago

I'm running a migration of issues from bb://pypa/setuptools to gh://pypa/setuptools, but twice now I've hit an error:

Completed 80 of 491 issues
Traceback (most recent call last):
  File "migrate.py", line 444, in <module>
    sys.exit(main(options))
  File "migrate.py", line 144, in main
    status_url, gh_auth, headers
  File "migrate.py", line 420, in verify_github_issue_import_finished
    .format(status_url, respo.status_code)
RuntimeError: Failed to check GitHub issue import status url: https://api.github.com/repos/pypa/setuptools/import/issues/441874 due to unexpected HTTP status code: 404

The first time it happened at issue 288, so it's apparently intermittent, perhaps a race condition.

jeffwidman commented 8 years ago

Hmm, that's weird, thanks for the report.

I'll reach out to GitHub and see what they say, as that endpoint theoretically shouldn't be 404'ing since it's the status url.

In the meantime, I'll add a temp fix later tonight of a 1 second delay before checking the status url. Not a great permanent solution as it slows down the imports by ~1 second for each issue when 95% of the time that isn't needed...

jeffwidman commented 8 years ago

I have reproduced this--I'm also having it occur on random issues... sometimes they work fine, sometimes the status url returns a 404.

I verified the status url being checked is the one that GitHub sent, so it appears to be something on GitHub's end... I've submitted a bug report to GitHub including this gist: https://gist.github.com/jeffwidman/a38865b34b9f9f292dd1

Unfortunately, sleeping does not fix the issue--it appears the status url is simply incorrect.

jaraco commented 8 years ago

Any word from Github on this issue? I just attempted a migration again and encountered the issue again.

jeffwidman commented 8 years ago

Basically the engineer came back saying he needed to see the headers (the equivalent of the output of curl -v IIRC). It took me a bit to figure out how to add the logging code, but when I finally got it working, I couldn't reproduce the issue. I tried a bunch of times.

I'll try to clean up the logging code a little this weekend and push it up as a separate branch for you to try. I'll also try to repro it, but who knows if I'll hit it or not...

jaraco commented 8 years ago

Migrating setuptools, I've hit the issue pretty reliably before 40%, so I'm fairly confident I can hit it again. Thanks for putting together the branch.

jaraco commented 8 years ago

I ran a failed import using pdb tonight and here are the headers:

Completed 167 of 1411 issues
> /Users/jaraco/Dropbox/code/public/bitbucket_issue_migration/migrate.py(516)verify_github_issue_import_finished()
-> if respo.status_code != 200:
(Pdb) respo.headers
{'Access-Control-Expose-Headers': 'ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval', 'X-GitHub-Media-Type': 'github.golden-comet-preview; format=json', 'X-Frame-Options': 'deny', 'Content-Type': 'application/json; charset=utf-8', 'Content-Security-Policy': "default-src 'none'", 'Server': 'GitHub.com', 'Date': 'Sun, 27 Mar 2016 21:17:55 GMT', 'X-RateLimit-Remaining': '4253', 'X-RateLimit-Limit': '5000', 'Status': '404 Not Found', 'Transfer-Encoding': 'chunked', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Content-Type-Options': 'nosniff', 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': '498100BB:2E04:E94475B:56F84E02', 'Access-Control-Allow-Origin': '*', 'X-RateLimit-Reset': '1459116680', 'X-XSS-Protection': '1; mode=block'}
(Pdb) respo.url
'https://api.github.com/repos/jaraco/cherrypy-mig-test-9/import/issues/559424'

I suspect there's nothing there that will help.

jaraco commented 8 years ago

And the headers pretty-printed:

(Pdb) pprint.pprint(dict(respo.headers))
{'Access-Control-Allow-Origin': '*',
 'Access-Control-Expose-Headers': 'ETag, Link, X-GitHub-OTP, '
                                  'X-RateLimit-Limit, X-RateLimit-Remaining, '
                                  'X-RateLimit-Reset, X-OAuth-Scopes, '
                                  'X-Accepted-OAuth-Scopes, X-Poll-Interval',
 'Content-Encoding': 'gzip',
 'Content-Security-Policy': "default-src 'none'",
 'Content-Type': 'application/json; charset=utf-8',
 'Date': 'Sun, 27 Mar 2016 21:17:55 GMT',
 'Server': 'GitHub.com',
 'Status': '404 Not Found',
 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload',
 'Transfer-Encoding': 'chunked',
 'X-Content-Type-Options': 'nosniff',
 'X-Frame-Options': 'deny',
 'X-GitHub-Media-Type': 'github.golden-comet-preview; format=json',
 'X-GitHub-Request-Id': '498100BB:2E04:E94475B:56F84E02',
 'X-RateLimit-Limit': '5000',
 'X-RateLimit-Remaining': '4253',
 'X-RateLimit-Reset': '1459116680',
 'X-XSS-Protection': '1; mode=block'}

jeffwidman commented 8 years ago

@jaraco thanks for putting this together. I apologize for the delay here, I've been rather out-of-sorts with the new baby. I'll try to review stuff this coming week, but I don't want to promise for sure.

jaraco commented 8 years ago

Here's an interesting tidbit. Using the code from the PR, I saw this in one attempted migration:

Imported Issue: https://api.github.com/repos/jaraco/cherrypy-mig-test-14/issues/72
Completed 72 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653417
Completed 73 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653418
Completed 74 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653419
Completed 75 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653420
Completed 76 of 1415 issues

It completed 71 requests in a row, then GitHub returned a 404 for four requests in a row... then it went on to complete several dozen more without errors. This suggests that something is temporarily stalled on GitHub's side.
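
If the 404s are transient, one option is to re-poll the status URL a bounded number of times instead of failing on the first 404. This is a minimal sketch, not the code from the PR: `poll_status` and the `fetch` callable (which takes a URL and returns a status code plus parsed body) are hypothetical names, and it assumes the status URL eventually starts returning 200.

```python
import time


def poll_status(status_url, fetch, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Poll an import status URL, retrying while it returns 404.

    `fetch` is a hypothetical callable taking a URL and returning a
    (status_code, body) tuple; in a real script it would wrap an HTTP GET.
    """
    for attempt in range(1, max_attempts + 1):
        status_code, body = fetch(status_url)
        if status_code == 200:
            return body
        if status_code != 404:
            # Anything other than 404 is treated as a hard failure.
            raise RuntimeError(
                "unexpected HTTP status {} for {}".format(status_code, status_url)
            )
        if attempt < max_attempts:
            # Wait a little longer after each consecutive 404.
            sleep(delay * attempt)
    raise RuntimeError(
        "{} still returned 404 after {} attempts".format(status_url, max_attempts)
    )
```

Passing `fetch` and `sleep` in as arguments keeps the retry logic separate from the HTTP library, so it can be exercised without hitting GitHub.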

jaraco commented 8 years ago

Dang. And it seems as if ignoring the 404s isn't sufficient:

Imported Issue: https://api.github.com/repos/jaraco/cherrypy-mig-test-14/issues/342
Completed 342 of 1415 issues
Imported Issue: https://api.github.com/repos/jaraco/cherrypy-mig-test-14/issues/343
Completed 343 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653688
Completed 344 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653689
Completed 345 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653690
Completed 346 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653691
Completed 347 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653692
Completed 348 of 1415 issues
404 retrieving status URL https://api.github.com/repos/jaraco/cherrypy-mig-test-14/import/issues/653693
Completed 349 of 1415 issues
Imported Issue: https://api.github.com/repos/jaraco/cherrypy-mig-test-14/issues/346
Traceback (most recent call last):
  File "migrate.py", line 566, in <module>
    sys.exit(main(options))
  File "migrate.py", line 205, in main
    assert gh_issue_id == issue['local_id']
AssertionError
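
The bare assert gives no hint about which numbers diverged. A small sketch of a more descriptive check, assuming the GitHub issue number is taken from the end of the issue URL shown in the log; `issue_number_from_url` and `check_issue_alignment` are hypothetical helpers, not functions from migrate.py:

```python
def issue_number_from_url(issue_url):
    """Return the trailing issue number of a GitHub issue API URL."""
    return int(issue_url.rstrip("/").rsplit("/", 1)[-1])


def check_issue_alignment(issue_url, local_id):
    """Raise a descriptive error if the GitHub number differs from local_id."""
    gh_issue_id = issue_number_from_url(issue_url)
    if gh_issue_id != local_id:
        raise RuntimeError(
            "GitHub issue #{} does not match Bitbucket issue #{} ({})".format(
                gh_issue_id, local_id, issue_url
            )
        )
    return gh_issue_id
```

With a check like this, the failure would name both issue numbers and the URL, which makes it easier to see how far the import has drifted.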

jeffwidman commented 8 years ago

Yeah, we really need GitHub to look into this. Thanks for doing all this research. I'll email their support again and see what they say.

jaraco commented 8 years ago

I managed to get the CherryPy issue tracker to migrate using https://github.com/jaraco/bitbucket_issue_migration/tree/retry-on-rate-limit. That branch depends on two pending pull requests plus some other changes, which I may submit as pull requests too, but only if my other suggestions can be accepted.

karlingen commented 8 years ago

I just got the same error:

Imported Issue: https://api.github.com/repos/karlingen/TheRepo/issues/273
Completed 273 of 679 issues
Traceback (most recent call last):
  File "migrate.py", line 490, in <module>
    sys.exit(main(options))
  File "migrate.py", line 190, in main
    status_url, options.gh_auth, headers).json()['issue_url']
  File "migrate.py", line 466, in verify_github_issue_import_finished
    .format(status_url, respo.status_code)
RuntimeError: Failed to check GitHub issue import status url: https://api.github.com/repos/karlingen/TheRepo/import/issues/724827 due to unexpected HTTP status code: 404

karlingen commented 8 years ago

@jaraco I tried your pull-request as well (#77) but instead got this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 578, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 351, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 814, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connection.py", line 289, in connect
    ssl_version=resolved_ssl_version)
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/util/ssl_.py", line 308, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 377, in wrap_socket
    _context=self)
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 752, in __init__
    self.do_handshake()
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 988, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 633, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:645)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/requests/adapters.py", line 403, in send
    timeout=timeout
  File "/usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 604, in urlopen
    raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: TLS/SSL connection has been closed (EOF) (_ssl.c:645)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "migrate.py", line 496, in <module>
    sys.exit(main(options))
  File "migrate.py", line 167, in main
    comments = get_issue_comments(issue['local_id'], bb_url, options.bb_auth)
  File "migrate.py", line 237, in get_issue_comments
    respo = requests.get(url, auth=bb_auth)
  File "/usr/local/lib/python3.5/site-packages/requests/api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/requests/api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.5/site-packages/requests/sessions.py", line 585, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/requests/adapters.py", line 477, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: TLS/SSL connection has been closed (EOF) (_ssl.c:645)

jaraco commented 8 years ago

> I tried your pull-request as well (#77) but instead got this error:

Did you get the error immediately? Did you get it at number 273? Did you get it at another seemingly arbitrary number?

The SSL error seems to indicate there's something wrong with their servers, or there's something breaking down in the requests logic. Perhaps you would benefit from upgrading requests?
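
If the dropped connection is transient, wrapping the failing call in a small retry helper may be enough to get past it. A rough sketch, assuming Python 3, where both `ssl.SSLError` and the exception classes in requests derive from `OSError`; `call_with_retries` is a hypothetical helper, not part of migrate.py or requests:

```python
import time


def call_with_retries(func, attempts=3, delay=2.0, sleep=time.sleep,
                      retry_on=(OSError,)):
    """Call func(), retrying when it raises one of the retry_on exceptions.

    The last failure is re-raised unchanged so the original traceback
    is still visible if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            # Wait a bit longer before each successive retry.
            sleep(delay * attempt)
```

The call from the traceback could then be wrapped as `call_with_retries(lambda: get_issue_comments(issue['local_id'], bb_url, options.bb_auth))`.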

karlingen commented 8 years ago

I got it when I reached issue number 273. Then I re-ran the operation starting from that number and it hasn't stopped since. Seems to be working now :)

jeffwidman commented 7 years ago

For the record, I gave up on this because GitHub support didn't seem that interested in pursuing it. An engineer responded to my emails, but kept asking me to do all the request debugging myself, to the point that I gave up and said screw it. Since it looks like a race condition, it's intermittent, and debugging or fixing it requires their support.

jeffwidman commented 7 years ago

I'm going to close this as some of the symptoms are fixed by #77.

We can't actually fix the issue as it's internal to GitHub and they don't seem very motivated to track it down and fix it.