pombreda / feedparser

Automatically exported from code.google.com/p/feedparser

urlparse.urljoin() crashes on unicode strings with non-ascii characters #274

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download a feed whose description contains a URL with non-ASCII unicode characters

What is the expected output? What do you see instead?

sgmllib sends a unicode string to urlparse.urljoin(), which crashes
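
For illustration only, and not the fix discussed in this thread: one defensive pattern is to percent-encode non-ASCII characters in a URL before joining it to a base. The sketch below assumes Python 3 (the report concerns Python 2.6, where the module was named `urlparse`), and `safe_urljoin` is a hypothetical helper name, not part of feedparser.

```python
from urllib.parse import quote, urljoin

# Characters that are left untouched: URL delimiters plus "%" so that
# already-encoded sequences are not encoded a second time.
URL_SAFE_CHARS = "/:?#[]@!$&'()*+,;=%"


def safe_urljoin(base, url):
    """Hypothetical helper: percent-encode non-ASCII characters (as UTF-8),
    then resolve the result against the base URL."""
    encoded = quote(url, safe=URL_SAFE_CHARS)
    return urljoin(base, encoded)


print(safe_urljoin("http://example.com/feed/", "caf\u00e9.html"))
```

Whether encoding as UTF-8 is appropriate depends on the server that serves the linked resource, so treat that choice as an assumption.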

What version of the product are you using? On what operating system?

Feedparser 4.1, Python 2.6, Ubuntu

Please provide any additional information below.

I'm attaching an anonymized dump of the traceback.

Original issue reported on code.google.com by pete.lin...@gmail.com on 28 Apr 2011 at 5:46

GoogleCodeExporter commented 9 years ago
I tested this against the GitHub repo version and got similar results. I
attempted to produce a test case, but haven't been able to reproduce the crash
except against this private feed. Hopefully the traceback contains enough info.

Original comment by pete.lin...@gmail.com on 28 Apr 2011 at 5:53

GoogleCodeExporter commented 9 years ago
I take it you can't link to -- or download and attach -- the feed for security
or privacy reasons? If you download the file and try parsing it from your
hard drive, does it still crash? If not, please attach a copy of the headers
that the server sends; the HTTP headers should be in the `headers` key of the
dictionary feedparser returns.
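
A minimal sketch of that check, assuming the feed has been saved to a local file: `summarize_result` and `parse_local_copy` are hypothetical helper names, and the file path is up to the reporter. The `headers` key is only expected to be populated when the feed was fetched over HTTP, so the helper falls back to an empty dictionary.

```python
def summarize_result(result):
    """Hypothetical helper: collect the fields worth attaching to a report
    from the dictionary that feedparser.parse() returns."""
    return {
        # HTTP response headers; absent when parsing a local file.
        "headers": dict(result.get("headers", {})),
        # feedparser sets `bozo` when the feed was not well-formed.
        "bozo": result.get("bozo", 0),
        "bozo_exception": repr(result.get("bozo_exception")),
    }


def parse_local_copy(path):
    """Parse a feed saved on disk and summarize the result."""
    # Imported here so summarize_result() works without feedparser installed.
    import feedparser

    return summarize_result(feedparser.parse(path))
```

Calling `parse_local_copy()` on the saved file and then `summarize_result()` on the result of parsing the live URL shows whether the crash depends on how the feed is delivered.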

I'm also assuming the github repo you're referring to is mine (which mirrors 
svn trunk), but would you download from svn trunk and report back if it's still 
crashing?

https://feedparser.googlecode.com/svn/trunk/feedparser/feedparser.py

Obviously I prefer to have a file to work with, in part because it helps me 
more quickly whittle it down to a test case. That said, I may be able to work 
with just the traceback when I have more time to look at it. Thanks!

Original comment by kurtmckee on 29 Apr 2011 at 7:26

GoogleCodeExporter commented 9 years ago
Thanks for looking into this, kurtmckee. As I was swapping versions, I noticed a
few packages installed globally on this system, notably ipython and feedparser.
I ripped it all out and reinstalled everything in a virtualenv, and I can no
longer reproduce this error.

My guess is that I was testing in ipython and it was not using the virtualenv I
thought it was.
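
One way to confirm which interpreter and which copy of a module a session is really using is sketched below, assuming Python 3. `module_origin` and `environment_report` are hypothetical helper names.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from,
    or None if the module cannot be found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def environment_report(name):
    """Summarize the running interpreter and where `name` resolves to."""
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,  # points inside the virtualenv when one is active
        "module_path": module_origin(name),
    }


print(environment_report("feedparser"))
```

Running this inside ipython and inside the plain interpreter and comparing the two reports would reveal a mismatch like the one suspected here.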

This can be closed.

Original comment by pete.lin...@gmail.com on 29 Apr 2011 at 8:43

GoogleCodeExporter commented 9 years ago
Thanks for the quick reply. I'll monitor the Ubuntu feedparser version and 
email the maintainer if a new release isn't eventually made.

Original comment by kurtmckee on 30 Apr 2011 at 7:54