Closed GoogleCodeExporter closed 9 years ago
The changes are in r1691 -- I'll take a look.
Original comment by eik...@gmail.com
on 7 Aug 2009 at 6:48
It looks like the html5lib usage is failing when the source isn't valid XHTML:
>>> from rdflib.graph import ConjunctiveGraph
>>> g = ConjunctiveGraph()
>>> g.parse(location='http://oreilly.com/catalog/9781565926288/',
...         format='rdfa',
...         lax=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ed/Projects/rdflib/rdflib/graph.py", line 985, in parse
    location=location, file=file, data=data, **args)
  File "/home/ed/Projects/rdflib/rdflib/graph.py", line 785, in parse
    parser.parse(source, self, **args)
  File "/home/ed/Projects/rdflib/rdflib/syntax/parsers/rdfa/__init__.py", line 170, in parse
    dom = _try_process_source(stream, options)
  File "/home/ed/Projects/rdflib/rdflib/syntax/parsers/rdfa/__init__.py", line 245, in _try_process_source
    parser = html5lib.HTMLParser(tree=treebuilders.getTreeBuilder("dom"))
NameError: global name 'html5lib' is not defined
Incidentally, this URL works OK using the RDFa Distiller, which is a service based on pyRDFa:
http://www.w3.org/2007/08/pyRdfa/extract?url=http://oreilly.com/catalog/9781565926288/
Original comment by ed.summers
on 18 Dec 2009 at 7:28
Forgot to mention that I do have html5lib installed ...
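A quick way to confirm that from the same interpreter that ran the parse is sketched below. This is only an illustration, not part of rdflib: the helper name module_available is hypothetical, and it assumes a modern Python 3 where importlib.util.find_spec exists (the session above is Python 2). It only tells you whether the interpreter can locate the package; it does not explain why the name is undefined inside the parser module.

```python
import importlib.util


def module_available(name):
    """Return True if the running interpreter can locate a top-level module.

    Hypothetical helper for illustration; it does not import the module,
    it only asks the import system whether a spec can be found.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: html5lib is the package name to look for.
    print("html5lib importable:", module_available("html5lib"))
```

If this prints False, the package is installed for a different interpreter or environment than the one running rdflib.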
Original comment by ed.summers
on 18 Dec 2009 at 7:30
Since the RDFa parser update is in and mostly working, I'd like to close this ticket. Is there any reason to keep it open, or any tickets we should create from this one?
Original comment by eik...@gmail.com
on 2 Feb 2010 at 9:15
I agree. It's stable, tested and both used and improved on by others now. Anything odd popping up would warrant new, specific tickets. Marked as Fixed.
Original comment by lindstr...@gmail.com
on 3 Feb 2010 at 5:41
So should we create tickets for the failing RDFa test suite tests? You can run the test suite with run_tests.py in trunk. These are the ones that fail, and they account for most of the remaining test failures:
TC #11
TC #92
TC #94
TC #100
TC #101
TC #102
TC #103
TC #114
TC #117
Original comment by ed.summers
on 3 Feb 2010 at 7:28
Please create an individual ticket for each one. They fail for various mysterious reasons, and separate tickets give us somewhere to discuss each of them.
Original comment by gromgull
on 3 Feb 2010 at 7:37
And another thing: there is always the N3 test trick, where we just moved the tests that fail to another folder, i.e. the n3 folder and the broken_parse_test folder under test.
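As a rough sketch of that trick (not the actual script used for the N3 tests; the function name quarantine_tests and its arguments are made up for illustration), moving a list of known-failing test files into a separate folder could look like this:

```python
import shutil
from pathlib import Path


def quarantine_tests(test_dir, broken_dir, failing_names):
    """Move the named files from test_dir into broken_dir.

    Hypothetical helper: returns the names that were actually moved and
    silently skips names that do not exist in test_dir.
    """
    test_dir, broken_dir = Path(test_dir), Path(broken_dir)
    broken_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in failing_names:
        src = test_dir / name
        if src.is_file():
            shutil.move(str(src), str(broken_dir / name))
            moved.append(name)
    return moved
```

The test runner then only picks up what is left in the original folder, and the quarantined files stay around until someone fixes them.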
Original comment by gromgull
on 3 Feb 2010 at 7:39
I tried to annotate my commits but didn't get the format right, so I'm commenting here. In r1766 and r1767 I moved all the notation3 parsing bits into the notation3 module, ridding us of the non-lowercase module named N3Parser. I also moved all the rdfxml parsing into a module of that name, removing a couple of non-lowercase module names.
Original comment by eik...@gmail.com
on 3 Feb 2010 at 7:43
See also r1765 for the notation3 related module shuffling update.
Original comment by eik...@gmail.com
on 3 Feb 2010 at 7:45
Original issue reported on code.google.com by
lindstr...@gmail.com
on 6 Aug 2009 at 11:09