I think this is a great idea. However, we've only just discussed the idea of
giving up backwards compatibility with versions older than 2.4:
http://groups.google.com/group/feedparser-dev/browse_thread/thread/f25ea27c41a0b196
How backwards compatible are your changes? Which older versions of Python are
still supported?
Original comment by adewale
on 6 Jun 2010 at 9:49
Like most Python 3 code, my branch of feedparser isn't compatible with
Python 2.x, mainly because of the change in how strings are represented.
Long story short, str and unicode are renamed to bytes and str respectively.
You don't use u"" for unicode anymore, but b"" for bytes. "" still represents
str, but it's no longer a stream of bytes; it's a sequence of characters (like
unicode in Python 2 was).
The other grammar and library changes are small enough that it should at least
run on 2.4 if I add some lines to deal with older versions of Python. But the
string change is the major problem.
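To make the string change concrete, here is a minimal Python 3 sketch. It is written for illustration only and is not taken from the branch; the sample text and the choice of UTF-8 are assumptions made for the example.

```python
# Python 3: str holds text (Unicode code points), bytes holds raw octets.
text = "caf\u00e9"             # str, what Python 2 called unicode
raw = text.encode("utf-8")     # bytes, what Python 2 called str

assert isinstance(text, str)
assert isinstance(raw, bytes)
assert len(text) == 4          # four characters
assert len(raw) == 5           # the accented e takes two bytes in UTF-8

# Text and bytes cannot be mixed implicitly; concatenation raises TypeError.
try:
    text + raw
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True

# An explicit decode turns the bytes back into text.
assert raw.decode("utf-8") == text
```

Code that treats the two types interchangeably, as Python 2 code often does, has to choose one or the other at every such point.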
Original comment by puzz...@gmail.com
on 7 Jun 2010 at 4:51
FYI, the Python 3 version of chardet has a separate directory in its repository.[1]
BeautifulSoup[2] and NumPy[3] have their own scripts for use with Python 3.
[1] http://code.google.com/p/chardet/source/browse/#hg/src-python3
[2] http://code.google.com/p/beautifulsoup/source/browse/branches/bs4/to3.sh
[3] http://projects.scipy.org/numpy/changeset?new=7883%40trunk/setup.py&old=7828%40trunk/setup.py#file0
Original comment by puzz...@gmail.com
on 7 Jun 2010 at 3:31
I'll happily accept your patches if you can make them work with Python 2.4 or
even 2.5. It would be even better if you can find a clean abstraction for the
differences between versions.
Original comment by a...@google.com
on 20 Jun 2010 at 2:57
I've modified feedparser so that it runs cleanly in Python 2.4 and up, and can
also be converted by the 2to3 tool and run in both Python 3.0 and 3.1. It
passes all of the unit tests across Python 2.4 through 3.1.
https://github.com/kurtmckee/feedparser/tree/py3
I'm not able to compile Python 2.3 on Ubuntu Maverick, so I haven't tested my
changes on any version of Python older than 2.4, but I avoided changing code as
much as possible, so hopefully absurdly old versions of Python can continue to
run the code if necessary. I haven't yet installed chardet and BeautifulSoup to
test feedparser with those libraries, but I will in the near future.
The only caveat is that, because sgmllib was deprecated in Python 2.6 and is no
longer included in Python 3, it's necessary to copy sgmllib.py from the Python
2 standard library (I used the version included in Python 2.7), run it through
the 2to3 tool, and remove the lines at the top that import, use, and then
delete the warnpy3k function, which also doesn't exist in Python 3.
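The manual clean-up step described above could be scripted roughly as follows. This is only a sketch: the helper name strip_warnpy3k is hypothetical, and the three-statement layout it looks for (an import of warnpy3k from warnings, a warnpy3k(...) call, then del warnpy3k) is an assumption about the top of sgmllib.py, so check it against your copy before relying on it.

```python
import re

# Assumed layout of the warning block near the top of sgmllib.py.
_WARNPY3K_BLOCK = re.compile(
    r"^from warnings import warnpy3k\n"   # the import
    r"warnpy3k\(.*?\)\n"                  # the call, possibly spanning lines
    r"del warnpy3k\n",                    # the clean-up
    re.MULTILINE | re.DOTALL,
)

def strip_warnpy3k(source):
    """Return source with the first warnpy3k import/call/del block removed."""
    return _WARNPY3K_BLOCK.sub("", source, count=1)

# A shortened stand-in for the real file, used only to exercise the helper.
sample = (
    '"""A parser for SGML."""\n'
    "from warnings import warnpy3k\n"
    'warnpy3k("the sgmllib module has been removed in Python 3.0",\n'
    "         stacklevel=2)\n"
    "del warnpy3k\n"
    "import re\n"
)
cleaned = strip_warnpy3k(sample)
```

If the block is not found, the source is returned unchanged, so the helper is safe to run on a file that has already been cleaned.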
If anyone doesn't want to use git I can provide a patch file that will apply
cleanly to svn r316. If it's appropriate I can add the sgmllib.py file to the
git branch with the minor changes noted above.
Please let me know in what ways I can improve the branch so that it can be
merged into svn trunk!
Original comment by kurtmckee
on 26 Nov 2010 at 3:30
I think we're ready to attempt the Python 3 merge. Can you generate a patch
that I can apply against HEAD (as of revision 346) so that people can try it
out?
Original comment by adewale
on 22 Dec 2010 at 11:07
I'm attaching the patch against r346 and the additional files I can think of,
but I recommend pulling from the git branch I linked to above in case I drop
the ball and miss a supporting file.
Original comment by kurtmckee
on 23 Dec 2010 at 3:12
As of revision 349 all of the changes for Python 3 support are in.
Note that I'm seeing the following errors when trying to run the tests using
Python 3.1:
======================================================================
ERROR: test_000225 (__main__.TestCase)
./tests/wellformed/http/headers_foo.xml: capture arbitrary HTTP header
----------------------------------------------------------------------
Traceback (most recent call last):
  File "feedparsertest.py", line 226, in <lambda>
    method(self, evalString, feedparser.parse(xmlfile))
  File "feedparsertest.py", line 143, in failUnlessEval
    if not eval(evalString, env):
  File "<string>", line 1, in <module>
KeyError: 'x-foo'
======================================================================
ERROR: test_000850 (__main__.TestCase)
./tests/illformed/encoding/linenoise.xml: unguessable characters
----------------------------------------------------------------------
Traceback (most recent call last):
  File "feedparsertest.py", line 143, in failUnlessEval
    if not eval(evalString, env):
  File "<string>", line 1
    bozo and entries[0].summary==u'\xe2\u20ac\u2122\xe2\u20ac\x9d\u0160'
                                                                       ^
SyntaxError: invalid syntax
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "feedparsertest.py", line 226, in <lambda>
    method(self, evalString, feedparser.parse(xmlfile))
  File "feedparsertest.py", line 151, in failUnlessEval
    if not eval(evalString, env):
  File "<string>", line 1
    bozo and entries[0].summary==u'\xe2\u20ac\u2122\xe2\u20ac\x9d\u0160'
                                                                       ^
SyntaxError: invalid syntax
----------------------------------------------------------------------
Ran 4099 tests in 42.266s
With my configuration 4099 tests are being run. How many are being run with
your configuration?
Original comment by adewale
on 24 Dec 2010 at 2:09
4099 tests run and pass; I simply forgot to attach modified versions of the two
tests that are failing (I wish you could pull directly from the git branches;
the fact that Subversion makes you deal with patch files is heartbreaking!).
headers_foo.xml has to be modified because Python 2 and Python 3 handle HTTP
headers differently. Python 2 normalizes all of the keys corresponding to HTTP
header names to lowercase. Python 3 doesn't. For this reason it's necessary to
modify the testcase to check for either 'x-foo' (for Python 2) or 'X-Foo' (for
Python 3, which is also the actual header that was sent).
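A check that has to accept both spellings can look the header up without regard to case. The sketch below uses a hypothetical helper, get_header, with plain dicts standing in for the parsed headers; it is not the change that was made to headers_foo.xml.

```python
def get_header(headers, name):
    """Return the value stored under an HTTP header name, ignoring case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    raise KeyError(name)

py2_style = {"x-foo": "bar"}   # keys lowercased, as described for Python 2
py3_style = {"X-Foo": "bar"}   # keys as sent, as described for Python 3

same_value = get_header(py2_style, "X-Foo") == get_header(py3_style, "x-foo")
```

HTTP header names are case-insensitive, so comparing on the lowercased name treats both forms as the same header.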
linenoise.xml needs to be modified because it's the only test that doesn't
include a space between the '=' and the 'u'. Adding a space makes the u'' -> ''
conversion in feedparsertest.py simpler.
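To see why the space matters, suppose the conversion is a plain text replacement of " u'" with " '". This is an assumed stand-in, not the actual code in feedparsertest.py, and the function name strip_u_prefix is hypothetical.

```python
def strip_u_prefix(eval_string):
    """Drop the u prefix from string literals that follow a space."""
    return eval_string.replace(" u'", " '")

with_space = "bozo and entries[0].summary == u'abc'"
without_space = "bozo and entries[0].summary==u'abc'"

converted = strip_u_prefix(with_space)        # prefix removed
unconverted = strip_u_prefix(without_space)   # no match, left untouched
```

Matching on the leading space keeps the replacement from touching a word that merely ends in u just before a quote, at the cost of missing any literal written without the space.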
Additionally, convert_to_py3.sh needs to be in the root directory (the same
directory as README-PYTHON3). It fails to convert the files because the paths
are incorrect where it's located now.
And finally, I noticed you added a line in README-PYTHON3 that we're requiring
the 2to3 tool from Python 3.1, and I looked into what kind of problem you might
be seeing. It looks like Python 3.0's 2to3 doesn't include the --no-diffs
command line option, but Python 2.6 and Python 3.1 do. I recommend removing
that line from README-PYTHON3 and instead modifying the sample command line in
README-PYTHON3 as well as the conversion script so that they don't include the
--no-diffs option. I've made and pushed this change to the git branch at GitHub.
Original comment by kurtmckee
on 24 Dec 2010 at 7:20
As of r354 there's a one-line fix that's needed to get all of the tests passing
in Python 3 (and with this patch all of my patches will be tested against
Python 3.0 and 3.1!). Attached is a patch; the git branch is updated as well.
Original comment by kurtmckee
on 3 Jan 2011 at 12:59
Patch applied in revision 355. Ran the tests against Python 3: Ran 4106 tests
in 41.667s
I'm marking this as fixed. Great work, Kurt.
Original comment by adewale
on 4 Jan 2011 at 3:45
Original issue reported on code.google.com by
puzz...@gmail.com
on 30 May 2010 at 11:57