...specifically, when discovering webmention endpoints and crawling sites during OPD.
we occasionally see original post links that require cookies. the usual symptom is a redirect loop: we get redirected to a URL that sets the cookie, but we don't pass it back in our next fetch, so we then get the same redirect. example log:
Webmention from https://brid-gy.appspot.com/like/twitter/edtechdev/617723808712105984/1058806590 to http://www.tandfonline.com/doi/full/10.1080/10494820.2015.1060504#.VZlRpHUVhBc
Sending...
Starting new HTTP connection (1): www.tandfonline.com
"GET /doi/full/10.1080/10494820.2015.1060504 HTTP/1.1" 302 None
"GET /doi/full/10.1080/10494820.2015.1060504?cookieSet=1 HTTP/1.1" 302 None
"GET /doi/full/10.1080/10494820.2015.1060504 HTTP/1.1" 302 None
"GET /doi/full/10.1080/10494820.2015.1060504?cookieSet=1 HTTP/1.1" 302 None
"GET /doi/full/10.1080/10494820.2015.1060504 HTTP/1.1" 302 None
"GET /doi/full/10.1080/10494820.2015.1060504?cookieSet=1 HTTP/1.1" 302 None
...
Traceback (most recent call last):
  File "/base/data/home/apps/s~brid-gy/3.385505439646824075/tasks.py", line 478, in do_send_webmentions
    if not mention.send(timeout=999, headers=util.USER_AGENT_HEADER):
  File "/base/data/home/apps/s~brid-gy/3.385505439646824075/local/lib/python2.7/site-packages/webmentiontools/send.py", line 24, in send
    self._discoverEndpoint()
  File "/base/data/home/apps/s~brid-gy/3.385505439646824075/local/lib/python2.7/site-packages/webmentiontools/send.py", line 30, in _discoverEndpoint
    r = requests.get(self.target_url, verify=False, **self.requests_kwargs)
  File "/base/data/home/apps/s~brid-gy/3.385505439646824075/local/lib/python2.7/site-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
...
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
TooManyRedirects: Exceeded 30 redirects.
looks like the easy fix is to make these fetches through a requests Session object, which keeps a cookie jar and sends stored cookies back on later requests.
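a rough sketch of the Session idea, using only plain requests. `make_session` and `fetch` are hypothetical helper names for illustration, not existing code in bridgy or webmentiontools, and the user agent string and timeout are placeholders. i haven't confirmed that this alone fixes the loop in the log above.

```python
import requests


def make_session(user_agent=None, max_redirects=30):
    """Build a Session whose cookie jar persists across requests.

    Hypothetical helper: the name and parameters are assumptions.
    """
    session = requests.Session()
    # 30 is requests' default redirect limit, the one hit in the traceback.
    session.max_redirects = max_redirects
    if user_agent:
        session.headers['User-Agent'] = user_agent
    return session


def fetch(session, url, timeout=15):
    """GET url through the session; return None on a redirect loop.

    Cookies set by any response are stored in session.cookies and sent
    back on later requests made through the same session.
    """
    try:
        return session.get(url, timeout=timeout)
    except requests.TooManyRedirects:
        # Still looping, so cookies were not the (only) problem.
        return None


if __name__ == '__main__':
    session = make_session(user_agent='example-agent')  # placeholder UA
    resp = fetch(session, 'http://www.tandfonline.com/doi/full/10.1080/10494820.2015.1060504')
    print(resp.status_code if resp is not None else 'redirect loop')
```

reusing one session for both endpoint discovery and the OPD crawl would let a cookie picked up in one fetch carry over to the next. wiring a session into webmentiontools' own `requests.get` call isn't shown here.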