dimones / feedparser

Automatically exported from code.google.com/p/feedparser

authentication creds are not used #283

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I'm using feedparser-5.0.1-py2.6 and have tried this on both CentOS and Windows. 
I'm having trouble accessing feeds protected by HTTP Basic Auth using the 
documented "hard way". This may be related to Issue 267, but my case is more basic.

Here's the first failure (please substitute your own known good URL/realm):

>>> import feedparser, urllib2
>>> url = 'https://example.com/news/feed/service'
>>> auth = urllib2.HTTPBasicAuthHandler()
>>> auth.add_password('*** realm ***', 'https://example.com', '*** username ***', '*** password ***')
>>> f = feedparser.parse(url, handlers=[auth])
>>> f.status
401
>>> f.entries
[]

Authentication failed! I expect status to be 200 and there to be some entries 
in the list. I skimmed the relevant source code and found a custom 
urllib2.Request factory (_build_urllib2_request) along with the custom 
_FeedURLHandler() object. I made some low-level calls and found something 
interesting: I can authenticate successfully if I use one of those two 
customizations, but not both:

>>> def test(request, opener):
...   opener.addheaders = []
...   f = opener.open(request)
...   f.read()
...   result = f.status if hasattr(f, 'status') else 'success'
...   f.close()
...   return result

# Use urllib2 standards
>>> request = urllib2.Request(url)
>>> opener = urllib2.build_opener(auth)
>>> test(request, opener)
'success'

# Use only _FeedURLHandler
>>> request = urllib2.Request(url)
>>> opener = apply(urllib2.build_opener, tuple([auth] + [feedparser._FeedURLHandler()]))
>>> test(request, opener)
'success'

# Use only _build_urllib2_request
>>> request = feedparser._build_urllib2_request(url, None, None, None, None, None, {})
>>> opener = urllib2.build_opener(auth)
>>> test(request, opener)
'success'

# Use both, as feedparser.parse() does
>>> request = feedparser._build_urllib2_request(url, None, None, None, None, None, {})
>>> opener = apply(urllib2.build_opener, tuple([auth] + [feedparser._FeedURLHandler()]))
>>> test(request, opener)
401

Is this a bug? I'd rather use a released version of your software than 
patch it myself. Thanks.

Original issue reported on code.google.com by tra...@gmail.com on 7 Jun 2011 at 11:39

GoogleCodeExporter commented 9 years ago
This is some great work, thanks for investigating what's going on! While 
revamping the unit tests I kept a close eye on the test coverage, and I've 
known that several areas -- including the auth code -- desperately needed to be 
tested.

I'll work to get this fixed as soon as I have time.

Original comment by kurtmckee on 13 Jun 2011 at 12:43

GoogleCodeExporter commented 9 years ago
The problem appears to be in _FeedURLHandler, and more specifically its 
inheritance from urllib2.HTTPDigestAuthHandler. HTTPDigestAuthHandler has a 
handler_order of 490, which gives it precedence over 
urllib2.HTTPBasicAuthHandler (which uses the default of 500). _FeedURLHandler 
is designed to always come last (well, second to last, as HTTPErrorProcessor 
has a handler_order of 1000) and remove any HTTPError exceptions, so running it 
ahead of the Basic auth handler means the 401 is handled before the Basic auth 
handler gets a chance to respond to it.
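
The ordering values mentioned above can be checked directly. This is a small sketch that assumes Python 3, where the urllib2 classes live in urllib.request; the class names and handler_order attribute are the standard library's, but the thread itself was written against Python 2's urllib2.

```python
import urllib.request as request  # Python 3 home of the old urllib2 API

# Handler classes discussed in the comment above.
handler_classes = [
    request.HTTPDigestAuthHandler,
    request.HTTPBasicAuthHandler,
    request.HTTPErrorProcessor,
]

# Lower handler_order means the handler is consulted earlier.
orders = {cls.__name__: cls.handler_order for cls in handler_classes}

for name, order in sorted(orders.items(), key=lambda item: item[1]):
    print(name, order)
```

Any class that inherits from HTTPDigestAuthHandler without overriding handler_order picks up the 490 value, which is the situation described for _FeedURLHandler.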

The attached patch just manually sets the handler_order back to the default 
value of 500, which fixes the issue in my case (but may break the Auth Digest 
handling).
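
To illustrate why changing the order from 490 to 500 matters, here is a toy sketch, again assuming Python 3's urllib.request. SwallowingHandler and RetryingAuthHandler are hypothetical stand-ins written for this example; they are not feedparser's _FeedURLHandler or the real HTTPBasicAuthHandler, and this is not the attached patch. No network access is involved: the 401 is dispatched by hand through OpenerDirector.error().

```python
import urllib.request as request


class SwallowingHandler(request.BaseHandler):
    """Hypothetical stand-in for a handler that hands the 401 back to the caller."""

    def http_error_401(self, req, fp, code, msg, headers):
        return "401 returned to caller"


class RetryingAuthHandler(request.BaseHandler):
    """Hypothetical stand-in for a Basic auth handler (default handler_order 500)."""

    def http_error_401(self, req, fp, code, msg, headers):
        return "retried with credentials"


def first_401_result(swallow_order):
    """Return the result of whichever handler gets to answer the 401 first."""
    opener = request.OpenerDirector()
    swallowing = SwallowingHandler()
    swallowing.handler_order = swallow_order
    # Same registration order as in the report: the auth handler is added first.
    opener.add_handler(RetryingAuthHandler())
    opener.add_handler(swallowing)
    req = request.Request("https://example.com/news/feed/service")
    # Dispatch a 401 by hand; the first non-None handler result wins.
    return opener.error("http", req, None, 401, "Unauthorized", {})


print(first_401_result(490))  # swallowing handler sorts ahead of the auth handler
print(first_401_result(500))  # tie on order, so the handler added first answers
```

With both handlers at 500, the one registered first answers, which would explain why the change helps when the auth handler is passed in ahead of the swallowing one.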

Original comment by t.dettr...@uq.edu.au on 10 Jan 2012 at 2:11

GoogleCodeExporter commented 9 years ago
As a workaround for this issue, you can simply raise the priority of your 
HTTPBasicAuthHandler to 490:

>>> handler = urllib2.HTTPBasicAuthHandler(passman)
>>> handler.handler_order = 490

This will give it an opportunity to handle the error before _FeedURLHandler.
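
A fuller version of the workaround might look like the sketch below. It assumes Python 3 (urllib.request rather than urllib2), and the password manager setup, username, and password are placeholders invented for this example, since the snippet above does not show how passman was created. The feedparser call is left as a comment so the block runs without feedparser installed or any network access.

```python
import urllib.request as request

# Placeholder feed URL taken from the original report.
url = "https://example.com/news/feed/service"

# Assumed setup: a password manager that ignores the realm name.
passman = request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "https://example.com", "user", "secret")

handler = request.HTTPBasicAuthHandler(passman)
# Match the digest handler's order so this handler is not pushed behind it.
handler.handler_order = 490

# Then pass the handler to feedparser as in the original report:
# feed = feedparser.parse(url, handlers=[handler])
```

Because handlers with the same handler_order keep their registration order, this relies on the Basic auth handler being added to the opener before _FeedURLHandler.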

Original comment by t.dettr...@uq.edu.au on 10 Jan 2012 at 2:15

GoogleCodeExporter commented 9 years ago
Thanks for the patch, I'll review it as soon as I have an opportunity!

Original comment by kurtmckee on 10 Jan 2012 at 8:02

GoogleCodeExporter commented 9 years ago
By the way, the workaround `handler.handler_order = 490` worked great for me; 
we're fixed in production now.

Original comment by tra...@gmail.com on 20 Nov 2012 at 11:09