j7260a / green-turtle

Automatically exported from code.google.com/p/green-turtle

"flickr_photos:by" etc is treated as a relative URI #15

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Visit e.g. http://www.flickr.com/photos/aigle_dore/8043720291/ in chrome
2. Click the green turtle icon

What is the expected output? What do you see instead?
All predicates with the flickr_photos: prefix are interpreted relative to the 
base URI, making them difficult to work with.  

Compare with the twitter: prefix, which is just passed through (I guess that 
behaviour is probably wrong from a pure RDF viewpoint, but it's pretty useful).

The difference just seems to be the underscore in the prefix, which has been 
verified with a non-flickr test page as well.  I don't see anything in the XML 
standard that stops an underscore being used in a prefix.

What version of the product are you using? On what operating system?
1.2.0, Chromium (but also when using the library itself in Firefox).

Please provide any additional information below.

A workaround I used was to just add flickr_photos: to the default mapping:

   this.target.graph.prefixes["flickr_photos"] = "flickr_photos:";

Original issue reported on code.google.com by peter.li...@gmail.com on 4 Nov 2013 at 3:27

GoogleCodeExporter commented 8 years ago
Neither the 'twitter' nor the 'flickr_photos' prefix is defined.  As such, the 
spec says:

"Finally, if there is no in-scope mapping for prefix, then the value is not a 
CURIE."

Since it is not a CURIE, it should be interpreted as a relative URI.  As such, 
the flickr_photos properties are correctly expanded to a full URI using the 
base URI.  What is not correct are the twitter: prefixed values.  They should 
also be expanded.

As such, the bug is quite the opposite.  I will need to investigate why this is 
happening.

Meanwhile, this will not help you because, either way, the values will be 
expanded to full URIs.  If the prefixes were defined, you would get the URI 
that is the result of concatenating the prefix's mapping and the value after 
the colon.  If a prefix is not defined, the value should resolve as a relative 
URI reference (i.e. how 'flickr_photos' works now).  The result is you get a 
full URI in the annotation graph.
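
A minimal sketch of the two cases described above, for illustration only. It is not GreenTurtle's implementation: `expandTerm` is a hypothetical helper, the prefix table holds just the `og` entry from the RDFa initial context, and the standard `URL` constructor stands in for whatever resolver the library uses.

```javascript
// Hypothetical helper (not part of GreenTurtle): expand a property value
// either as a CURIE or, failing that, as a URI reference against the base.
function expandTerm(value, prefixes, baseURI) {
  const colon = value.indexOf(":");
  if (colon > 0) {
    const prefix = value.slice(0, colon);
    const reference = value.slice(colon + 1);
    if (Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
      // In-scope mapping: concatenate the mapping and the part after the colon.
      return prefixes[prefix] + reference;
    }
  }
  // No in-scope mapping: the value is not a CURIE, so resolve it as a
  // URI reference relative to the base URI.
  return new URL(value, baseURI).href;
}

const base = "http://www.flickr.com/photos/aigle_dore/8043720291/";
const knownPrefixes = { og: "http://ogp.me/ns#" };

const title = expandTerm("og:title", knownPrefixes, base);
const flickr = expandTerm("flickr_photos:by", knownPrefixes, base);
const twitter = expandTerm("twitter:description", knownPrefixes, base);

console.log(title);   // http://ogp.me/ns#title
console.log(flickr);  // http://www.flickr.com/photos/aigle_dore/8043720291/flickr_photos:by
console.log(twitter); // twitter:description
```

With this particular resolver, "twitter:description" comes back unchanged because "twitter" is a syntactically valid URI scheme, whereas an underscore is not allowed in a scheme, so "flickr_photos:by" is treated as a relative path.  That may be where the difference between the two prefixes comes from, though this has not been checked against the library's own code.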

It looks as though Flickr assumes prefixes that aren't actually in the 
referenced RDFa initial context for HTML.

Original comment by a...@milowski.com on 4 Nov 2013 at 5:04

GoogleCodeExporter commented 8 years ago
Thanks, foiled by the RDFa standard again...

I would like to be able to select usability with broken RDFa over standards 
compliance.  Could there be an option to attach() etc to select a 
CURIE-preferential parsing of anything that looks like a qname?

Alternatively, I could work with the prefix hack for the known prefixes that 
people are sloppy with, but that would require being able to pass additional 
mappings to the attach() method, as it seems to me that adding mappings later 
doesn't affect any predicates that have already been parsed.

Would any of those options be possible?

Original comment by peter.li...@gmail.com on 4 Nov 2013 at 5:13

GoogleCodeExporter commented 8 years ago
Well, you aren't being foiled by the RDFa standard.  Flickr is producing broken 
RDFa.

Property names are URIs and they need to be expanded into one.  If they use a 
CURIE, the prefix used needs to be either in the initial context or be defined 
in the document.

Once you have the prefix, you can use the shortened CURIE value in your code.  
For example, this works with the current document:

   document.data.getValues(document.baseURI,"og:title");  // => ["Kitten"]

That said, the API can help with this situation.  Currently, you can re-process 
the document with:

   document.data.implementation.attach(document)

but that resets the defined prefixes to the standard initial context.

I could add a feature like:

   document.data.implementation.attach(document, { prefixes: { "twitter" : "http://twitter.com", ...} } );

so you could add to that initial context.   Then you could do:

   document.data.getValues(document.baseURI,"twitter:description");

and probably get what you want (i.e. a shorter way to use long URIs).

Would that be helpful?  I could add that when I sort out this bug.
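
A rough sketch of how caller-supplied prefixes could be layered over the initial context on every re-processing pass.  All names here (`INITIAL_CONTEXT`, `buildPrefixes`, `expandCurie`) are hypothetical and nothing below calls GreenTurtle; the trailing slash on the twitter mapping is also an assumption, added because a CURIE expands by plain concatenation.

```javascript
// Hypothetical stand-in for the standard initial context (only one entry shown).
const INITIAL_CONTEXT = Object.freeze({ og: "http://ogp.me/ns#" });

// Build the prefix table for one processing pass. Extra prefixes are merged
// over a fresh copy of the defaults, so the defaults are never mutated.
function buildPrefixes(options = {}) {
  return { ...INITIAL_CONTEXT, ...(options.prefixes || {}) };
}

// Expand a CURIE by concatenation; return null when the prefix is unknown.
function expandCurie(curie, prefixes) {
  const colon = curie.indexOf(":");
  if (colon < 1) return null;
  const prefix = curie.slice(0, colon);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return null;
  return prefixes[prefix] + curie.slice(colon + 1);
}

const extended = buildPrefixes({ prefixes: { twitter: "http://twitter.com/" } });
const withExtras = expandCurie("twitter:description", extended);
const withoutExtras = expandCurie("twitter:description", buildPrefixes());

console.log(withExtras);    // http://twitter.com/description
console.log(withoutExtras); // null
```

Because expansion is plain concatenation, a mapping without a trailing "/" or "#" would run the host and the local name together, so the mapping value probably wants a separator at the end.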

Original comment by a...@milowski.com on 4 Nov 2013 at 5:23

GoogleCodeExporter commented 8 years ago
Adding prefixes would help a lot, thanks!  I can just list all these fake 
namespaces that I need to work with, though I'll likely just perpetuate the sin 
and use "twitter:" as the URI. (The reason is that I mainly use GreenTurtle to 
locate all metadata about a subject and then export all its predicates onto the 
clipboard as RDF/XML.  It's easier for downstream applications to get the 
regular twitter:creator than something that is more correct RDF but unique to 
this application.)

(And, yes, I know flickr is wrong, it's just that I've previously submitted a 
bug report where it turned out I didn't have the standard straight...)

Original comment by peter.li...@gmail.com on 4 Nov 2013 at 5:40

GoogleCodeExporter commented 8 years ago
No problem.  Filing issues helps find ways to improve the API and 
implementation.  I'll probably have time this Friday to work on this.

Original comment by a...@milowski.com on 4 Nov 2013 at 5:45

GoogleCodeExporter commented 8 years ago

Original comment by a...@milowski.com on 4 Jun 2014 at 10:48