Closed jamsden closed 6 months ago
In the this.parseDOM() function, changing:
var nv = parsetype.nodeValue;
if (nv === "Literal"){
  frame.datatype = RDFParser.ns.RDF + "XMLLiteral";
  // (this.buildFrame(frame)).addLiteral(dom) // should work but doesn't
  frame = this.buildFrame(frame);
  frame.addLiteral(dom);
  dig = false;
}
to:
var nv = parsetype.nodeValue;
if (nv === "Literal"){
  frame.datatype = RDFParser.ns.RDF + "XMLLiteral";
  // (this.buildFrame(frame)).addLiteral(dom) // should work but doesn't
  frame = this.buildFrame(frame);
  frame.addLiteral(dom.lastChild.nodeValue);
  dig = false;
}
to get the actual content of the literal node seems to work. Might this break something else?
I didn't mean to close the issue.
It appears the dataType is incorrect:
{
  subject: {
    uri: 'https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/oslc/contexts/_pMhMgPsWEeSnQvDHoYok5w/workitems/services.xml',
    value: 'https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/oslc/contexts/_pMhMgPsWEeSnQvDHoYok5w/workitems/services.xml' },
  predicate: {
    uri: 'http://purl.org/dc/terms/title',
    value: 'http://purl.org/dc/terms/title' },
  object: {
    value: 'JKE Banking (Change Management)',
    lang: '',
    datatype: [Object] },
  why: {
    uri: 'https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/oslc/workitems/catalog',
    value: 'https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/oslc/workitems/catalog' } },
Should it be:
{ value: 'JKE Banking (Change Management)', lang: undefined, datatype: undefined }
or somehow a string? Or am I doing this query incorrectly:
var sp = this.catalog.statementsMatching(undefined, DCTERMS('title'), 'JKE Banking (Change Management)');
Does the string literal object need to be wrapped in this.catalog.literal? I tried that too, still didn't match, and I noticed that wrapping the string as a literal leaves the datatype undefined as shown above.
I'm making some progress. The 'addLiteral' function of the RDFParser frameFactory adds the datatype sym('http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral') for literal nodes, while kb.literal('JKE Banking (Change Management)') leaves the datatype undefined, so they never match. If I force the datatype to XMLLiteral, then the match works:
var sp = this.catalog.statementsMatching(undefined, DCTERMS('title'), this.catalog.literal('JKE Banking (Change Management)', undefined, this.catalog.sym('http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral')));
This doesn't seem to match the documentation which says you should be able to just use a JavaScript string. Is this a bug or does it work as intended, and I have to create these literals with the symbol datatype?
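The mismatch described above can be reproduced with a small stand-alone model. This is only a sketch: `makeLiteral` and `sameLiteral` are hypothetical helpers, not rdflib.js functions, and it assumes the store treats two literals as equal only when value, language, and datatype all agree.

```javascript
// Hypothetical stand-ins for illustration; this is not the rdflib.js API.
const XML_LITERAL = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral';

function makeLiteral(value, lang, datatype) {
  return { value, lang, datatype };
}

// Assumption: literals match only if value, lang, and datatype all agree.
function sameLiteral(a, b) {
  return a.value === b.value && a.lang === b.lang && a.datatype === b.datatype;
}

// What the parser stores for a parseType="Literal" property.
const parsed = makeLiteral('JKE Banking (Change Management)', undefined, XML_LITERAL);
// What a bare string (or a literal without datatype) amounts to in the query.
const plain = makeLiteral('JKE Banking (Change Management)', undefined, undefined);
// What the query looks like once the datatype is forced to XMLLiteral.
const typed = makeLiteral('JKE Banking (Change Management)', undefined, XML_LITERAL);

console.log(sameLiteral(parsed, plain)); // false
console.log(sameLiteral(parsed, typed)); // true
```

Under that assumption, the query pattern has to carry the same datatype as the stored literal, which is consistent with the forced-XMLLiteral workaround matching.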
The parsetype="Literal" syntax in RDF/XML is for quoting pieces of embedded XML literally. I think you probably just want strings. If you just leave out parsetype="Literal" then I suspect you will have the strings you want.
Unfortunately I don't control the RDF/XML source; it's from the Rational Team Concert OSLC Service Provider Catalog. So I may have to just deal with RTC's quirk in how it expresses dcterms:title. That's no problem.
However, isn't there still an issue? The RDF/XML source is:
<oslc:serviceProvider>
  <oslc:ServiceProvider rdf:about="https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/oslc/contexts/_pMhMgPsWEeSnQvDHoYok5w/workitems/services.xml">
    <dcterms:title rdf:parseType="Literal">JKE Banking (Change Management)</dcterms:title>
    <oslc:details rdf:resource="https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/process/project-areas/_pMhMgPsWEeSnQvDHoYok5w"/>
    <jfs_proc:supportLinkDiscoveryViaLinkIndexProvider rdf:parseType="Literal">false</jfs_proc:supportLinkDiscoveryViaLinkIndexProvider>
    <jfs_proc:supportContributionsToLinkIndexProvider rdf:parseType="Literal">true</jfs_proc:supportContributionsToLinkIndexProvider>
    <jfs_proc:globalConfigurationAware rdf:parseType="Literal">compatible</jfs_proc:globalConfigurationAware>
    <jfs_proc:consumerRegistry rdf:resource="https://oslclnx2.rtp.raleigh.ibm.com:9443/ccm/process/project-areas/_pMhMgPsWEeSnQvDHoYok5w/links"/>
  </oslc:ServiceProvider>
</oslc:serviceProvider>
Seems like the value of this property should be an XMLLiteral, but it shouldn't include the property element itself, just the value:
JKE Banking (Change Management)
(is this even valid XML?) not
<dcterms:title rdf:parseType="Literal">JKE Banking (Change Management)</dcterms:title>
I think my patch above is incorrect. The this.parseDOM() function for Literal nodes:
var nv = parsetype.nodeValue;
if (nv === "Literal"){
  frame.datatype = RDFParser.ns.RDF + "XMLLiteral";
  // (this.buildFrame(frame)).addLiteral(dom) // should work but doesn't
  frame = this.buildFrame(frame);
  frame.addLiteral(dom);
  dig = false;
}
should normalize the children of the Literal property (so that === on embedded XML works consistently regardless of ordering), and use an XML serializer to create the value of the node which should be XML source, not parsed DOM. I see similar code in the RDFa parser. If this is correct, I can submit a fix.
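A rough sketch of that idea, serializing only the children of the property element back to XML source with attributes in a fixed order. Everything here is an assumption for illustration: the plain-object nodes stand in for real DOM nodes, `serializeNode` and `literalSource` are hypothetical names, and proper XML canonicalization (namespaces, entities, and so on) is considerably more involved than this.

```javascript
// Minimal node stand-ins: { nodeType, nodeName, nodeValue, attributes, childNodes }.
// nodeType 1 = element, 3 = text (the same numbers the DOM uses).
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function serializeNode(node) {
  if (node.nodeType === 3) return escapeText(node.nodeValue);
  // Sort attributes by name so attribute order does not affect string equality.
  const attrs = Object.keys(node.attributes || {}).sort()
    .map((name) => ` ${name}="${escapeText(node.attributes[name]).replace(/"/g, '&quot;')}"`)
    .join('');
  const inner = (node.childNodes || []).map(serializeNode).join('');
  return `<${node.nodeName}${attrs}>${inner}</${node.nodeName}>`;
}

// Serialize only the children of the property element, not the element itself.
function literalSource(propertyElement) {
  return (propertyElement.childNodes || []).map(serializeNode).join('');
}

const title = {
  nodeType: 1,
  nodeName: 'dcterms:title',
  attributes: { 'rdf:parseType': 'Literal' },
  childNodes: [{ nodeType: 3, nodeValue: 'JKE Banking (Change Management)' }],
};
console.log(literalSource(title)); // JKE Banking (Change Management)
```

With this shape, a property whose only child is a text node yields just the text, and two literals whose embedded elements differ only in attribute order serialize to the same string.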
Interesting, I have a problem here in May 2016 with Jim's oslc-client being unable to find Service Providers because the statementsMatching method is not finding XMLLiterals that contain the sought CCM Project Name (name only). I wonder if rdflib.js evolved while Jim's OSLC4JS example has not.
My patch for XMLLiterals has not been merged into rdflib.js yet.
Because the find-the-service-provider-by-name method is only looking for a string in what is likely to be a fairly small set of titles, we could refactor the method to retrieve all ?-title-? statements and then use a simple JS or Lodash collection filter to pick out the pattern "(.*)${serviceProviderTitle}(.*)". That may be good enough versus trying to get the rdflib.catalog to recognize our particular literal value string. What do you think?
The following, together with the addition of lodash and escapeStringRegexp, allows the method to find the statement that relates the subject URI to the literal title for the sought serviceProviderTitle.
var haveTitle = this.catalog.statementsMatching(
undefined,
DCTERMS('title'),
undefined );
const regex = new RegExp( ".*?" + escapeStringRegexp( serviceProviderTitle ) + ".*?" );
var sp = _.filter( haveTitle,
(s) =>
{
return s.object.value.match( regex );
}
);
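For comparison, the same filtering can be sketched without lodash or a regex-escaping package. The statement objects and example.org URIs below are hypothetical stand-ins shaped like the dump earlier in the thread, and `findByTitle` is an illustrative name, not part of rdflib.js or oslc-client.

```javascript
// Hypothetical stand-ins for the statements returned by statementsMatching.
const haveTitle = [
  { subject: { value: 'https://example.org/sp/1' },
    object: { value: 'JKE Banking (Change Management)' } },
  { subject: { value: 'https://example.org/sp/2' },
    object: { value: 'JKE Banking (Quality Management)' } },
];

function findByTitle(statements, serviceProviderTitle) {
  // String.prototype.includes does a literal substring test,
  // so the parentheses in the title need no regex escaping.
  return statements.filter((s) => s.object.value.includes(serviceProviderTitle));
}

const sp = findByTitle(haveTitle, 'JKE Banking (Change Management)');
console.log(sp.length);           // 1
console.log(sp[0].subject.value); // https://example.org/sp/1
```

This only works around the lookup; the stored literal still has whatever value and datatype the parser gave it.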
@jamsden probably an even easier fix, without introducing a new dependency:
frame.addLiteral(dom.childNodes)
frame.addLiteral(dom.childNodes) does indeed work.
DOM such as:
JKE Banking (Change Management)
Another paragraph
And another paragraph
JKE Banking (Change Management)
Another paragraph
And another paragraph
So this becomes a one-line code change. I'll implement it in my fork, test it, and create a pull request. There is about to be a lot of use of rdflib.js in developing OSLC integrations. This defect is a showstopper, however, since OSLC makes a lot of use of parseType="Literal".
This change does not behave nicely in-browser.
The browser's DOMParser handles serialization of NodeLists differently than the library used for Node.js. In the browser, objects get serialized as "[object NameOfDataType]" rather than the contents of the list.
I would propose that the line
frame.addLiteral(dom.childNodes)
would be better as
//frame.addLiteral(dom.innerHTML);
frame.addLiteral(dom.innerHTML || dom.childNodes);
This both serializes the inner content and preserves its XML content, as required by parseType='Literal'. By checking innerHTML first we use that by default; otherwise we assume we are in Node and serialize with the default childNodes handler.
I'm a little fuzzy on how nodejs handles this. I assume xmldom does not have an innerHTML property.
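The fallback can be illustrated with stand-in objects rather than real DOM nodes. This is a sketch under assumptions: the two classes only imitate the behaviours described in this thread (a browser NodeList stringifying as "[object NodeList]", and a Node-side NodeList whose toString() returns serialized XML), `literalValue` is a hypothetical wrapper, and it is assumed that the value ends up coerced to a string.

```javascript
// Stand-in for a browser NodeList: String() falls back to Object.prototype.toString.
class BrowserLikeNodeList {
  get [Symbol.toStringTag]() { return 'NodeList'; }
}

// Stand-in for a NodeList whose toString() serializes its contents
// (the Node-side behaviour assumed from this thread).
class SerializingNodeList {
  constructor(xml) { this.xml = xml; }
  toString() { return this.xml; }
}

// The proposed expression, wrapped in a function and coerced to a string.
function literalValue(dom) {
  return String(dom.innerHTML || dom.childNodes);
}

const inBrowser = { innerHTML: '<p>Hello</p>', childNodes: new BrowserLikeNodeList() };
const inNode = { innerHTML: undefined, childNodes: new SerializingNodeList('<p>Hello</p>') };
const emptyInBrowser = { innerHTML: '', childNodes: new BrowserLikeNodeList() };

console.log(literalValue(inBrowser));      // <p>Hello</p>
console.log(literalValue(inNode));         // <p>Hello</p>
console.log(literalValue(emptyInBrowser)); // [object NodeList]
```

The last case is worth noting: an empty innerHTML string is falsy, so in this model an empty literal element in the browser still falls through to childNodes.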
Issue verified in:
https://forum.solidproject.org/t/errors-parsing-xml-with-rdflib-js-in-the-browser/448
We are facing the same issue. Is it possible to get that fixed or do you have any workarounds? Thanks
@AndreyBespamyatnov
//frame.addLiteral(dom.innerHTML);
frame.addLiteral(dom.innerHTML || dom.childNodes);
Does this solve your issue, or are there other issues? I published rdflib@2.2.34-1 on npm with this patch. Does it work for you? Can you test it?
Hi @bourgeoa, let me try the new version. If it doesn't work I will come back with more information about the issue and some test data. Thank you.
@bourgeoa, we had the same issue as this bug in an implementation of the OSLC AM V3 specification using rdflib@2.2.31, and moving to rdflib@2.2.34-1 resolved the issue with no side effects. Thanks for the fix.
@bourgeoa, this fix is not in rdflib@2.2.34-beta or rdflib@2.2.34. When will the next rdflib release containing this fix be published to https://www.npmjs.com/package/rdflib?
@paulslauenwhite
merged in rdflib@2.2.35
Thanks @bourgeoa! Confirmed rdflib@2.2.35 contains this fix. Will https://github.com/linkeddata/rdflib.js/releases be updated with the 2.2.35 release?
Given some RDF/XML that contains:
And a query such as: someKb.the(aServiceProvider, DCTERMS('title'));
returns:
instead of the text. Am I missing something, or is the dcterms:title being parsed incorrectly?