ropensci / rdflib

:package: High level wrapper around the redland package for common rdf applications
https://docs.ropensci.org/rdflib

Error message reading an HTML file after using rdflib::rdf_query() #48

Closed: maelle closed this issue 1 year ago

maelle commented 1 year ago

Coming here via https://github.com/eblondel/zen4R/issues/141 cc @oggioniale

rdflib::rdf_query() somehow affects a later call to xml2::read_html(); see the transcript below, run in a clean session. I'm not using {reprex} because the message doesn't show up in its output.

> xml2::read_html("<html><body><nav>bla</nav></body></html>")
{html_document}
<html>
[1] <body><nav>bla</nav></body>
> 
> doc <- system.file("extdata", "dc.rdf", package="redland")
> 
> sparql <-
+ 'PREFIX dc: <http://purl.org/dc/elements/1.1/>
+  SELECT ?a ?c
+  WHERE { ?a dc:creator ?c . }'
> 
> rdf <- rdflib::rdf_parse(doc)
> rdflib::rdf_query(rdf, sparql)
# A tibble: 1 × 2
  a                      c           
  <chr>                  <chr>       
1 http://www.dajobe.org/ Dave Beckett
> xml2::read_html("<html><body><nav>bla</nav></body></html>")
librdf error - HTML parser error: Tag nav invalid
{html_document}
<html>
[1] <body><nav>bla</nav></body>

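So the surprising part is the "librdf error - HTML parser error: Tag nav invalid" message on the second xml2::read_html() call, which otherwise returns the same document as the first call.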

maelle commented 1 year ago

This is probably a redland problem: https://github.com/ropensci/redland-bindings/issues/99