pombreda / feedparser

Automatically exported from code.google.com/p/feedparser

ntlm auth problem #267

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. from ntlm import HTTPNtlmAuthHandler
2. passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
3. passman.add_password(None, url, user, password)
4. handler = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
5. print feedparser.parse(url, handlers=[handler])

What is the expected output? What do you see instead?
I see the IIS error page ("You are not authorized to view this page due to 
invalid authentication headers.") instead of the parsed feed.

What version of the product are you using? On what operating system?
feedparser-5.0.1 on Ubuntu Lucid 10.04.2 LTS, Python-2.6.5

Please provide any additional information below.
I fixed it with the attached patch (which drops the addition of 
_FeedURLHandler()), but it's a hack, so please fix it properly.

Original issue reported on code.google.com by yaroslav...@gmail.com on 18 Apr 2011 at 11:18

GoogleCodeExporter commented 9 years ago
I'm not sure what to make of this. The only effect `_FeedURLHandler` will have 
is to move some of Python's default handlers to the end of the list:

 * HTTPDigestAuthHandler
 * HTTPRedirectHandler
 * HTTPDefaultErrorHandler

With `_FeedURLHandler` the ntlm handler will be called before those three, 
while your patch simply puts the ntlm handler at the end. What happens if you 
change the line in your patch to:

    opener = apply(urllib2.build_opener, tuple([_FeedURLHandler()] + handlers))

Does that solve the problem? Either way, I'll probably have to get with the 
ntlm developers and see if they can help me resolve this as I have no access to 
an NTLM server.

Original comment by kurtmckee on 18 Apr 2011 at 8:39

GoogleCodeExporter commented 9 years ago

Original comment by kurtmckee on 18 Apr 2011 at 8:40

GoogleCodeExporter commented 9 years ago
Nope, both the original code and the new code give me the same opener.handlers:

    [<feedparser._FeedURLHandler instance at 0x2f12200>,
     <urllib2.UnknownHandler instance at 0x2f121b8>,
     <urllib2.HTTPHandler instance at 0x2f12368>,
     <urllib2.FTPHandler instance at 0x2f123b0>,
     <urllib2.FileHandler instance at 0x2f12248>,
     <urllib2.HTTPSHandler instance at 0x2f12440>,
     <ntlm.HTTPNtlmAuthHandler.HTTPNtlmAuthHandler instance at 0x27f5200>,
     <urllib2.HTTPErrorProcessor instance at 0x2f123f8>]

Original comment by yaroslav...@gmail.com on 18 Apr 2011 at 9:02
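
One possible reason both variants produce the same list: the opener keeps its handlers sorted by each handler's `handler_order` attribute, so the order of the arguments alone does not decide where a handler ends up. The sketch below uses Python 3's `urllib.request` (the successor to `urllib2`, chosen here only so that it runs on a current interpreter). `EarlyHandler` and `LateHandler` are hypothetical stand-ins, not the real `_FeedURLHandler` or `HTTPNtlmAuthHandler`.

```python
import urllib.request


class EarlyHandler(urllib.request.BaseHandler):
    # Hypothetical handler using the same handler_order as HTTPDigestAuthHandler (490).
    handler_order = 490

    def http_error_401(self, req, fp, code, msg, headers):
        return None


class LateHandler(urllib.request.BaseHandler):
    # Hypothetical handler that keeps BaseHandler's default handler_order (500).
    def http_error_401(self, req, fp, code, msg, headers):
        return None


def handler_names(*handlers):
    """Build an opener and return the class names of its handlers, in order."""
    opener = urllib.request.build_opener(*handlers)
    return [type(h).__name__ for h in opener.handlers]


# Same two handlers, passed in opposite orders.
order_a = handler_names(EarlyHandler(), LateHandler())
order_b = handler_names(LateHandler(), EarlyHandler())

print(order_a == order_b)
print(order_a.index("EarlyHandler") < order_a.index("LateHandler"))
```

If this also holds for the `urllib2` version in use, swapping the operands of the `+` would not be expected to change `opener.handlers`.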

GoogleCodeExporter commented 9 years ago
I think that _FeedURLHandler gets the 401 Unauthorized error and handles it, 
but it needs to be handled by HTTPNtlmAuthHandler, so that handler needs to come 
before _FeedURLHandler. But it's on the left of the "+", so I don't understand...

Original comment by yaroslav...@gmail.com on 18 Apr 2011 at 9:09
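
The hypothesis above can be illustrated without a server. In Python 3's `urllib.request`, `OpenerDirector.error()` walks the handlers registered for an HTTP status code in `handler_order` and stops at the first one that returns something other than `None`. The sketch below is a minimal illustration under that assumption. `GreedyHandler` and `AuthHandler` are hypothetical names, and whether feedparser 5.0.1 behaves this way is not confirmed in this thread.

```python
import urllib.request

calls = []  # records which handlers were asked to deal with the 401


class GreedyHandler(urllib.request.BaseHandler):
    # Hypothetical: sorts ahead of the default order (500) and always "handles" the 401.
    handler_order = 490

    def http_error_401(self, req, fp, code, msg, headers):
        calls.append("GreedyHandler")
        return fp  # a non-None result ends the chain


class AuthHandler(urllib.request.BaseHandler):
    # Hypothetical stand-in for an authentication handler with the default order.
    def http_error_401(self, req, fp, code, msg, headers):
        calls.append("AuthHandler")
        return None


opener = urllib.request.build_opener(AuthHandler(), GreedyHandler())
request = urllib.request.Request("http://example.invalid/feed")
fake_response = object()  # placeholder for the 401 response object

# Dispatch a 401 through the opener's error chain; no network traffic is involved.
result = opener.error("http", request, fake_response, 401, "Unauthorized", {})
print(calls)
```

In this sketch `AuthHandler` is never consulted, even though it was passed to `build_opener()` first.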

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
@4: That's why I'm confused, too.

@5: `apply()` doesn't support that call signature. It has to be a sequence.

@6: All of the default handlers are implicitly included at the beginning of the 
handler list unless they're included in the `build_opener()` argument list 
(either explicitly or as a parent to a subclass).

Original comment by kurtmckee on 18 Apr 2011 at 9:30
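
The rule about default handlers can be checked directly. This is a small sketch using Python 3's `urllib.request` rather than `urllib2`. `QuietRedirectHandler` is a hypothetical subclass introduced only for the demonstration.

```python
import urllib.request


class QuietRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical subclass; it inherits all behaviour unchanged."""


opener = urllib.request.build_opener(QuietRedirectHandler())
handler_types = [type(h) for h in opener.handlers]

# The stock HTTPRedirectHandler is skipped because a subclass instance was supplied.
plain_redirect_count = handler_types.count(urllib.request.HTTPRedirectHandler)
subclass_count = handler_types.count(QuietRedirectHandler)

# Defaults that were not overridden are still installed.
has_http_handler = urllib.request.HTTPHandler in handler_types

print(plain_redirect_count, subclass_count, has_http_handler)
```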

GoogleCodeExporter commented 9 years ago
I've tried changing the line to

    opener = apply(urllib2.build_opener, [handlers[0], _FeedURLHandler()])

but that doesn't work either.

Original comment by yaroslav...@gmail.com on 18 Apr 2011 at 9:34

GoogleCodeExporter commented 9 years ago
I've opened a ticket with the python-ntlm project asking for assistance:

https://code.google.com/p/python-ntlm/issues/detail?id=25

Original comment by kurtmckee on 18 Apr 2011 at 9:42

GoogleCodeExporter commented 9 years ago
Okay, I've reached a solution. The feed will need to be requested before being 
passed to feedparser.

Original comment by kurtmckee on 19 Nov 2012 at 5:04
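
A minimal sketch of that workaround, written for Python 3's `urllib.request`: download the feed with your own opener, then hand the raw bytes to feedparser. `fetch_feed_bytes` is a hypothetical helper name, and the NTLM handler is assumed to come from the third-party python-ntlm package as in the original report.

```python
import urllib.request


def fetch_feed_bytes(url, handlers=(), timeout=30):
    """Download url with an opener built from handlers and return the raw body."""
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


# Hypothetical usage (needs the third-party feedparser and python-ntlm packages,
# and an ntlm_handler built as in the reproduction steps):
#   data = fetch_feed_bytes(url, handlers=[ntlm_handler])
#   feed = feedparser.parse(data)
```

Because `feedparser.parse()` also accepts the document content itself, feedparser's own handler setup is not involved in the download.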

GoogleCodeExporter commented 9 years ago
Hi, I'm the new guy on the ntlm project, trying to clean up as many outstanding 
problems as possible.  I've discovered an incompatibility between ntlm and 
urllib2.  The NTLM protocol only authenticates TCP sessions, so all negotiation 
must be done with "Connection: Keep-Alive" in effect.  The urllib2 library, 
OTOH, forces the use of "Connection: Close".  To get around this, ntlm will 
take over several aspects of the urllib2 open methods; unfortunately this 
currently requires ntlm's hard-coding HTTPConnection as the connection manager. 
 I've submitted a proposal to the urllib2 project at bugs.python.org that 
suggests a work-around.

Anyway, this sounds like it may be related, but I don't know enough about 
feedparser to be sure.  Can someone help me figure this out?  Thanks.

Original comment by samw...@gmail.com on 19 Feb 2013 at 1:06
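
The "Connection: Close" behaviour described above can be observed with a throwaway local server. This is a sketch for Python 3's `urllib.request`, not `urllib2` itself. The `EchoConnectionHeader` server and the helper name are hypothetical and exist only for this check.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoConnectionHeader(BaseHTTPRequestHandler):
    """Replies with the value of the Connection header it received."""

    def do_GET(self):
        body = (self.headers.get("Connection") or "").encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def connection_header_sent_by_urllib():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), EchoConnectionHeader)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # Ask for Keep-Alive explicitly to see whether the request preserves it.
        request = urllib.request.Request(url, headers={"Connection": "Keep-Alive"})
        # An empty ProxyHandler keeps environment proxy settings out of the test.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return response.read().decode("ascii")
    finally:
        server.shutdown()
        server.server_close()


print(connection_header_sent_by_urllib())
```

The server reports the header value it actually received, which shows whether the requested Keep-Alive survived.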