alexmilowski / green-turtle

An RDFa 1.1 implementation for browsers.
MIT License

Parsing external content into a graph #6

Open betehess opened 9 years ago

betehess commented 9 years ago

My goal is to parse some external content into a graph, and to have some control over the graph implementation being used.

So I have tried the following in the Chrome console, from the test harness:

var input = '<html><head><title>titleTest</title></head><body> <div vocab="http://schema.org/" typeof="Invoice">  <h1 property="description">January 2015 Visa</h1>  <link property="url" href="http://acmebank.com/invoice.pdf" />Invoice PDF  <div property="broker" itemscope typeof="http://schema.org/BankOrCreditUnion">    <b property="name">ACME Bank</b>  </div>  <span property="accountId">xxxx-xxxx-xxxx-1234</span>  <div property="customer" typeof="http://schema.org/Person">    <b property="name">Jane Doe</b>  </div>  <span property="paymentDue">2015-01-30</span>  <div property="minimumPaymentDue" typeof="http://schema.org/PriceSpecification">    <span property="price">15.00</span>    <span property="priceCurrency">USD</span>  </div>  <div property="totalPaymentDue" typeof="http://schema.org/PriceSpecification">    <span property="price">200.00</span>    <span property="priceCurrency">USD</span>  </div>  <meta property="billingPeriod" content="2014-12-21/P30D" />starts:2014-12-21 30 days  <span property="paymentStatus">payment due</span></div> </body></html>'
undefined

var parser=new DOMParser();
undefined

var foo = parser.parseFromString(input, 'text/html')
undefined

GreenTurtle.attach(foo)
Uncaught Bad URI value, no scheme:

GreenTurtle.attach(foo, {baseURI: 'http://example.com'})
Uncaught Bad URI value, no scheme:

I have a few additional questions:

betehess commented 9 years ago

I actually tried again today and it worked this time. I must have made some mistake yesterday. Sorry for the confusion.

betehess commented 9 years ago

My bad, there is an issue after all. I had replaced line 606 of RDFa.js and forgotten about it:

var base = this.parseURI('http://example.com');

That's where the issue was.

betehess commented 9 years ago

There is a similar issue in RDFaProcessor.prototype.process. When it is called with a node produced by DOMParser.parseFromString, node.baseURI is null, and the subsequent call to removeHash blows up when it tries to find the "#".

The processor should fall back to options.baseURI instead.
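
To make the suggestion concrete, here is a rough sketch of the fallback I have in mind. The function name resolveBaseURI is hypothetical and not part of Green Turtle; I am also assuming that removeHash only strips the fragment from the URI, which I have not checked against the actual source.

```javascript
// Hypothetical helper, NOT part of Green Turtle's API.
// Picks a base URI for processing, preferring the node's own baseURI
// and falling back to options.baseURI when the node has none
// (as reported above for documents built with DOMParser.parseFromString).
function resolveBaseURI(node, options) {
  var candidate = (node && node.baseURI) || (options && options.baseURI) || null;
  if (!candidate) {
    // Fail with a clear message instead of crashing on null later.
    throw new Error('No base URI available: pass options.baseURI');
  }
  // Assumed equivalent of removeHash: drop the fragment, if any.
  var hash = candidate.indexOf('#');
  return hash >= 0 ? candidate.substring(0, hash) : candidate;
}

// Usage with a stand-in for a parsed document whose baseURI is null.
var parsedDoc = { baseURI: null };
var base = resolveBaseURI(parsedDoc, { baseURI: 'http://example.com/doc#frag' });
console.log(base);
```

With something like this in place, the attach call with {baseURI: 'http://example.com'} would have a usable base even when the node itself reports none.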