drlivingston / kr

Clojure API for RDF and SPARQL - provides consistent access to APIs including Jena and Sesame

Retrieving the language tag with queries #9

Closed nicolasGuillouet closed 10 years ago

nicolasGuillouet commented 10 years ago

Hi Kevin,

I have a question about edu.ucdenver.ccp.kr.sparql/query.

First, I would like to retrieve the language tag when I query my data. For example, if I have something like:

    <!-- http://purl.org/ontology/mo/musical_work1 -->

    <owl:NamedIndividual rdf:about="&mo;musical_work1">
        <rdf:type rdf:resource="&mo;MusicalWork"/>
        <opus>BWV 824</opus>
        <opus xml:lang="en">BWV 825</opus>
        <key rdf:resource="http://purl.org/NET/c4dm/keys.owl#AFlat"/>
        <movement rdf:resource="&mo;movement1"/>
    </owl:NamedIndividual>

On which I run: (kr-sparql/query datas '(mo/musical_work1 mo/opus ?/o))

The result I expected is "BWV 825@en" instead of "BWV 825". What is the best way to do this?

Secondly (perhaps I should open another issue?):

With this data:

    <!-- http://purl.org/NET/c4dm/keys.owl#AFlat -->

    <rdf:Description rdf:about="http://purl.org/NET/c4dm/keys.owl#AFlat">
        <rdfs:label rdf:datatype="&rdfs;Literal">la bémol</rdfs:label>
    </rdf:Description>

When I run (kr-sparql/query datas '(purl/AFlat rdfs/label ?/o)), it returns ({?/o #<TypedValue com.hp.hpl.jena.datatypes.BaseDatatype$TypedValue@6084a2ef>}). I tried declaring a method, (defmethod kr-clj-ify/clj-ify BaseDatatype$TypedValue [kb l] (.lexicalValue l)), without success :-(.

Thanks for your help ;-)

Nicolas

drlivingston commented 10 years ago

Hi Nicolas,

Let's take these one at a time. Regarding the first question, KR can certainly support understanding typed literals that are being input. For example, instead of using "Bob" in an object position you can use ["Bob" "en"], or you could do things like [40 xsd/integer]. You can see examples of this in the tests too (let me know and I can provide more specific pointers).

I know I had planned to have KR also listen to flags that would allow results from the RDF and SPARQL APIs to selectively come back in that fashion too... I see traces of that code, although it is unclear whether I ever completed it. I apologize for leaving that code unfinished. I will try to clean that up; hopefully I can get to it this weekend. It will likely come in two forms: a flag that "boxes" all the literals as above, regardless of whether they are understood, and maybe another flag/function that you could define to selectively "box" only some of the literals.

To the second question: if we can't get this resolved trivially we should open another issue. The two are related but different (it may be simple, it may not). If we can't get to the bottom of it quickly, it might help to make a gist or something runnable that reliably demonstrates the problem. You seem to have exactly the right intuition as to what needs to be done. (I'm assuming you are (a) sure that the code is loaded, and (b) that BaseDatatype is known in that code, or I'm sure it wouldn't compile. Still, you might want to try fully qualifying the name.)

I cannot yet tell if the value you are getting for ?/o is the raw return value, because clj-ify didn't find a matching signature, or if the following method signature matched and the type you are seeing is the output of this:

(defmethod clj-ify com.hp.hpl.jena.graph.Node_Literal [kb l] 
  (.getLiteralValue l))

I'll try to load your triples and see what I see. (Just for reference, how are you loading your triples?)

Could you start a second issue for this?

thanks for your feedback, and I hope to get you unstuck soon, Kevin

drlivingston commented 10 years ago

I think I've figured out a plan for returning typed literals. My intention is to have a flag/function that will force literals to be "boxed" in a vector with their types. The default (nil / :clj) will be off, i.e. the current behavior.

It will have two other values, :string and :clj-type. The first will return a two-part vector with a string value and then the type/language. The second, :clj-type, will do what the default does but also return the type/language as the second value in the vector.

Instead of just a flag for all or nothing, this will be a function you can define: you can force it to do everything one way or the other, or it can give different formats (:clj, :clj-type, :string) depending on the RDF type of the literal.

I think this covers most bases, but I welcome discussion before I start implementing this. It should be straightforward, but it will require changing both the Jena and Sesame implementations and adding tests. I might not be able to test every datatype, but I'll get some decent tests in there.
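To make the three modes concrete, here is a minimal sketch in plain Clojure that needs neither KR nor Jena on the classpath. It only models the proposal described above: `box-literal`, `mode-for`, the map-based literal representation, and the one-argument signature of the selector function are all assumptions made for illustration, not KR's actual API.

```clojure
;; Hypothetical model of a parsed RDF literal: the lexical string, the
;; converted Clojure value, and a tag that is nil (plain literal),
;; a string (language tag), or a symbol (datatype).
(def opus-en {:lexical "BWV 825" :value "BWV 825" :tag "en"})
(def forty   {:lexical "40"      :value 40        :tag 'xsd/integer})

(defn box-literal
  "Render lit according to mode: nil or :clj (plain value), :string,
  :clj-type, or a function from literal to one of those keywords
  (the function signature is an assumption)."
  [mode lit]
  (let [resolved (if (fn? mode) (mode lit) mode)]
    (case (or resolved :clj)                      ; nil means the default, :clj
      :clj      (:value lit)                      ; current behavior, no boxing
      :string   [(:lexical lit) (:tag lit)]       ; string value plus type/language
      :clj-type [(:value lit) (:tag lit)])))      ; converted value plus type/language

(defn mode-for
  "Example selector: box only the literals that carry a language tag."
  [lit]
  (if (string? (:tag lit)) :clj-type :clj))
```

With these definitions, `(box-literal :string forty)` yields `["40" xsd/integer]`, `(box-literal :clj-type forty)` yields `[40 xsd/integer]`, and `(box-literal mode-for opus-en)` yields `["BWV 825" "en"]`, while `(box-literal mode-for forty)` stays unboxed as `40`.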

Comments from anyone? I'll try to get this in there this (long) weekend.

nicolasGuillouet commented 10 years ago

Hi Kevin,

thanks for your reply ;-).

It seems very good: we retrieve the language tag only if we need it and, unlike in the Jena API, it isn't concatenated to the string value.

Nicolas

ps : I create a new issue for the second question.

drlivingston commented 10 years ago

I've added a flag for controlling this. It's in the current GitHub version; do you have the ability to get the code from there and compile it?

It works like this:

(binding [*literal-mode* :clj-type]
  (kr-sparql/query datas '(mo/musical_work1 mo/opus ?/o))) ; e.g. the query from the first comment

That flag can also take a function to selectively add types, but I haven't tested that yet. If you set it like above it will put types on all your literals. I'll get the function thing working too.

I think actually this patch might fix the other problem you were having too with clj-ify - can you check that too?

thanks, Kevin

drlivingston commented 10 years ago

Function mode is tested now too.

See the example of *literal-mode* at the bottom of:

https://github.com/drlivingston/kr/blob/master/kr-core/src/test/clojure/edu/ucdenver/ccp/test/kr/test_rdf.clj

nicolasGuillouet commented 10 years ago

Ok, it is a good and flexible solution ;-).

But now I have a question: can we say that the second element of the vector is a language tag if it is a String?

Nicolas

drlivingston commented 10 years ago

Yes, that should be the case. It will either be nil, for a plain literal, or if it's a string it should be a language tag. If it's a type it should be processed into a symbol (URI).
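As a small illustration of that rule, the second element of a boxed literal can be inspected like this. `literal-kind` is a hypothetical helper written for this example, not something KR provides, and it assumes the literal arrives as a two-element vector.

```clojure
(defn literal-kind
  "Classify a boxed literal [value tag] by its second element."
  [[_value tag]]
  (cond
    (nil? tag)    :plain      ; plain literal, no language or datatype
    (string? tag) :language   ; a string is a language tag, e.g. "en"
    (symbol? tag) :datatype   ; a symbol names the datatype URI
    :else         :unknown))  ; anything else deviates from the rule

(literal-kind ["BWV 824" nil])    ; => :plain
(literal-kind ["BWV 825" "en"])   ; => :language
(literal-kind [40 'xsd/integer])  ; => :datatype
```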

I'm going to mark this as closed, but if you find any deviation from this or there are other issues just comment/re-open.

Kevin