Closed aolieman closed 10 years ago
Thanks for this detailed issue report, Alex. I am reading up on the topic (e.g. http://www.websci11.org/fileadmin/websci/Posters/98_paper.pdf) and looking into a solution in HTTPURIDereferencer.
Thanks for expanding the dereferencer functionality to IRIs, Josh :+1: !
What I'm wondering now is how I can use the updated version in Rexster. In my naïveté, I tried a `mvn clean install` in the Rexster parent dir, but the output shows that ripple linkeddatasail is not updated and version 1.0 is kept. Would you know of a way I can use the new LinkedDataSail, or should I wait for a new Rexster snapshot?
TinkerPop 2.5.0-SNAPSHOT depends on Ripple 1.1, but since that release was fairly recent and there are some major changes underway in 1.2-SNAPSHOT, it will be a little while before the dereferencer change makes it into TinkerPop. If you can't wait, you can always tweak ripple.version in the blueprints pom.xml after building Ripple locally. Thanks again for helping to improve LDS.
In theory I could have waited, but I was way too excited to try this in Rexster. After building Ripple locally and changing Blueprints' pom.xml, I had to rebuild Blueprints and Rexster to get everything working.
One thing that surprised me, though, was that I could now dereference IRIs through the Rexster CLI and the Doghouse, but all my attempts to do the same through Python failed. It turned out to be quite a simple problem: the Unicode IRIs I was calling in my script needed to be encoded in Windows-1252 to work. Not a problem per se, but I found it confusing because the output is in UTF-8 (as I would expect).
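To make the encoding difference concrete, here is a minimal Python sketch (not the script from this thread). It shows how one IRI turns into different bytes under UTF-8 and Windows-1252. Both IRIs are hypothetical examples, picked only because they contain non-ASCII characters.

```python
# Hypothetical IRI, chosen only because it contains a non-ASCII character.
iri = "http://fr.dbpedia.org/resource/Côte_d'Ivoire"

utf8_bytes = iri.encode("utf-8")     # "ô" becomes two bytes: C3 B4
cp1252_bytes = iri.encode("cp1252")  # "ô" becomes one byte: F4

print(utf8_bytes)
print(cp1252_bytes)

# Windows-1252 covers only a limited character set; anything outside it
# cannot be encoded at all (hypothetical IRI again).
try:
    "http://ja.dbpedia.org/resource/東京".encode("cp1252")
except UnicodeEncodeError as err:
    print("not representable in Windows-1252:", err.reason)
```

If the Windows-1252 requirement holds in general, IRIs with characters outside that code page would presumably still fail from Python, so this workaround may only cover Western European IRIs.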
Hi Josh,
Many of the non-English DBpedias, and likely other LD publishers, use IRIs instead of URIs. As I understand it, they implement RFC 3987 correctly by serving these resources at URIs that are the percent-encoded IRIs. Their triples, however, use the unencoded IRIs.
Currently, we can only dereference IRIs with LinkedDataSail by percent-escaping them ourselves. But since the triples in the response use IRIs, a workaround is needed to access them (at least in Gremlin). From the perspective of someone using LinkedDataSail in Gremlin, the issue could be solved by percent-encoding the IRI internally, solely for dereferencing, while still using the IRI as the vertex id. By the way, only non-ASCII characters should be encoded, so an apostrophe ' should not become %27.
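The proposed encoding rule can be sketched in Python as follows. This is only an illustration, not how HTTPURIDereferencer works: the function name `iri_to_uri` and the example IRI are hypothetical, and the sketch skips the Unicode normalization that RFC 3987 also discusses.

```python
from urllib.parse import quote


def iri_to_uri(iri: str) -> str:
    """Percent-encode only the non-ASCII characters of an IRI.

    Each non-ASCII character is replaced by the percent-encoded octets of
    its UTF-8 representation; ASCII characters (including the apostrophe)
    are passed through unchanged.
    """
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in iri
    )


# Hypothetical IRI used for illustration only.
iri = "http://fr.dbpedia.org/resource/Côte_d'Ivoire"
uri = iri_to_uri(iri)
print(uri)  # the URI to request over HTTP
print(iri)  # the IRI that would remain the vertex id
```

The point is that the encoded form is used only for the HTTP request, while the original IRI stays as the identifier, so it matches the IRIs in the returned triples.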
Hopefully this example in Gremlin illustrates the problem:
Cheers, Alex