Closed jayvdb closed 8 years ago
I'm not really surprised that the links parser isn't perfect. Contributions are welcome to improve it: otherwise, it'll go on our backlog until one of us has time to take a swing at doing it properly.
And to be clear we explicitly aren't 5988 complaint because Kenneth decided he did not want to handle the case where one rel could have multiple links. There's an effort to add that to the requests-toolbelt though.
This has already been fixed by https://github.com/kennethreitz/requests/pull/2271.
Indeed. I cant see any problems with the output using 2.11.1, with that website.
>>> import requests, pprint
>>> pp = pprint.PrettyPrinter(indent=4)
>>> requests.__version__
'2.11.1'
>>> r = requests.get('http://ws-dl-05.cs.odu.edu/demo/index.php/Tyrion_Lannister')
>>> pp.pprint(r.headers['link'])
('<http://ws-dl-05.cs.odu.edu/demo/index.php/Tyrion_Lannister>; rel="original '
'latest-version",<http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Tyrion_Lannister>; '
'rel="timegate",<http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Tyrion_Lannister>; '
'rel="timemap"; type="application/link-format"; from="Mon, 23 Apr 2007 '
'20:26:15 GMT"; until="Fri, 06 Sep 2013 17:19:06 '
'GMT",<http://ws-dl-05.cs.odu.edu/demo/index.php?title=Tyrion_Lannister&oldid=1714>; '
'rel="first memento"; datetime="Mon, 23 Apr 2007 20:26:15 '
'GMT",<http://ws-dl-05.cs.odu.edu/demo/index.php?title=Tyrion_Lannister&oldid=107222>; '
'rel="last memento"; datetime="Fri, 06 Sep 2013 17:19:06 GMT"')
>>> pp.pprint(r.links)
{ 'first memento': { 'datetime': 'Mon, 23 Apr 2007 20:26:15 GMT',
'rel': 'first memento',
'url': 'http://ws-dl-05.cs.odu.edu/demo/index.php?title=Tyrion_Lannister&oldid=1714'},
'last memento': { 'datetime': 'Fri, 06 Sep 2013 17:19:06 GMT',
'rel': 'last memento',
'url': 'http://ws-dl-05.cs.odu.edu/demo/index.php?title=Tyrion_Lannister&oldid=107222'},
'original latest-version': { 'rel': 'original latest-version',
'url': 'http://ws-dl-05.cs.odu.edu/demo/index.php/Tyrion_Lannister'},
'timegate': { 'rel': 'timegate',
'url': 'http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeGate/Tyrion_Lannister'},
'timemap': { 'from': 'Mon, 23 Apr 2007 20:26:15 GMT',
'rel': 'timemap',
'type': 'application/link-format',
'until': 'Fri, 06 Sep 2013 17:19:06 GMT',
'url': 'http://ws-dl-05.cs.odu.edu/demo/index.php/Special:TimeMap/Tyrion_Lannister'}}
@shawnmjones, maybe there are other websites to test with for compliance?
The link header for http://ws-dl-05.cs.odu.edu/demo/index.php/Tyrion_Lannister is not being parsed correctly by
parse_header_links
, and that function doesnt look completely RFC 5988 compliant.Oddly this was the subject of #2250, as it appears the comma's are still causing the problem, so it may be that some corner cases still need to be handled better, and some unit tests written to confirm the problem is actually solved. Happy to help with either.