AlainCouthures / xsltforms

XForms to XHTML+Javascript (AJAX) conversion based on a unique XSL transformation. Suitable server-side (PHP) or client-side (Google Chrome, Edge, Internet Explorer, Mozilla FireFox, Opera, Safari) browser treatment where an XSLT 1.0 engine is available

Support for JSON-LD #10

Closed timathom closed 7 years ago

timathom commented 9 years ago

I am looking into XSLTForms' support for JSON (a very nice feature!), and I am particularly interested in using JSON-LD[1]. However, it appears that the "@" character, which JSON-LD uses in keywords such as @context and @id, is causing the json2xml method to break.

Following the "JSON for XForms" paper presented at XML Prague, I would expect JSON names containing characters that are not permitted in XML names to be represented using the exml:fullname attribute. Reading the code (in XFInstance.js.xml), I wasn't able to tell where the validity of XML names was being checked, or how to modify the code in order to handle the "@" character.

[1] http://www.w3.org/TR/json-ld/

timathom commented 9 years ago

After further testing, I guess that handling for non-XML-conformant JSON names has not yet been implemented in XSLTForms. For instance, the example from the XML Prague paper:

{
    "a & b": "A+B",
    "déjà": "already",
    "________": "underscores"
}

throws an xforms-link-exception. (I don't see why déjà would be an invalid XML name, however, since diacritics are not reserved characters).

Changing the JSON snippet to:

{
    "a_b": "A+B",
    "deja": "already",
    "________": "underscores"
}

allows it to be loaded as instance data using xf:instance/@src. With the diacritics on déjà, an xforms-link-exception is thrown. Placing the JSON snippet directly within an xf:instance and including the diacritics on déjà does not throw an error (although "a & b" still does).

timathom commented 9 years ago

I added a simple regex test to the json2xml method in XFInstance.js.xml (line 442), which seems to work as a quick fix (though it does not cover edge cases):

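// Inside a character class "|" is a literal, so only ":" and "." are listed as extra allowed characters.
var regex = /^XML|[^\u00C0-\u1FFF\u2C00-\uD7FF\w:.]+/i;
if (name !== "" && (name === "________" || name.match(regex) !== null)) {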
    fullname = " exml:fullname=\"" + XsltForms_browser.escape(name) + "\"";
    name = "________";
}

One last observation/question: the character encoding of diacritics is not being rendered correctly when the JSON instance is loaded using xf:instance/@src. For example, the key déjà is rendered as something like dj. This issue disappears when the req.overrideMimeType argument on line 180 of XFInstance.js.xml is changed from charset=x-user-defined to charset=utf-8. I'm not sure whether changing this value creates other problems, however.
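To illustrate why the charset matters here, the following is a minimal sketch and not XSLTForms code. It assumes the JSON file is served as UTF-8 and models x-user-defined as I understand it from the WHATWG Encoding Standard: every byte becomes its own character, so the two-byte sequences for é and à are never reassembled. The variable names are made up for the example.

```javascript
// The key "déjà", written with escapes so the example does not depend on file encoding.
const key = "d\u00e9j\u00e0";
const bytes = new TextEncoder().encode(key); // UTF-8 bytes: 64 C3 A9 6A C3 A0

// x-user-defined: ASCII bytes map to themselves, bytes 0x80-0xFF map
// one-to-one to U+F780-U+F7FF (private use area, usually not displayed).
const asUserDefined = Array.from(bytes, (b) =>
  String.fromCharCode(b < 0x80 ? b : 0xf780 + (b - 0x80))
).join("");

// Decoding the same bytes as UTF-8 restores the original key.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

console.log(asUserDefined.length, asUtf8.length);
console.log(asUtf8 === key);
```

The private-use characters in the first result would explain a key that displays as something like dj. Binary downloads that rely on the byte-per-character behaviour could be affected by switching the charset, which is the open question above.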

AlainCouthures commented 9 years ago

Actually, the regular expression for NCName is best suited for this.

So, I think the following instructions would be more appropriate:

XsltForms_browser.json2xmlreg = new RegExp("^[A-Za-z_\xC0-\xD6\xD8-\xF6\xF8-\xFF][A-Za-z_\xC0-\xD6\xD8-\xF6\xF8-\xFF-.0-9\xB7]*$");
XsltForms_browser.json2xml = function(name, json, root, inarray) {
    var fullname = "";
    if (name === "________" || name !== "" && !XsltForms_browser.json2xmlreg.test(name)) {
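As a standalone illustration of that test, here is a small sketch that can be run outside XSLTForms. The helper name `elementNameFor` and the returned object shape are hypothetical. The pattern covers only the ASCII and Latin-1 part of NCName, and it includes a leading underscore because NCName allows one.

```javascript
// ASCII + Latin-1 subset of the XML NCName production (no colon allowed).
const ncNameLatin1 =
  /^[A-Za-z_\xC0-\xD6\xD8-\xF6\xF8-\xFF][A-Za-z_\xC0-\xD6\xD8-\xF6\xF8-\xFF\-.0-9\xB7]*$/;
const PLACEHOLDER = "________";

// Hypothetical helper: decide which element name a JSON name would get,
// and whether the original name has to be kept in exml:fullname.
// Empty names are passed through unchanged, as in the condition above.
function elementNameFor(jsonName) {
  if (jsonName === PLACEHOLDER || (jsonName !== "" && !ncNameLatin1.test(jsonName))) {
    return { element: PLACEHOLDER, fullname: jsonName };
  }
  return { element: jsonName, fullname: null };
}

for (const name of ["deja", "d\u00e9j\u00e0", "a & b", "@context", "ical:dtstart", "________"]) {
  console.log(JSON.stringify(name), elementNameFor(name));
}
```

With this pattern, deja and déjà are kept as element names, while "a & b", "@context", "ical:dtstart" and the placeholder itself fall back to the placeholder element with the original name preserved.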

Thank you for your feedback!

timathom commented 9 years ago

Excellent! However, should colons also be allowed, in order to represent JSON names that are also valid XML QNames? Or would that create potential problems with namespace declarations? Also, what is the best approach for rendering diacritics in JSON names (e.g., changing the req.overrideMimeType value to charset=utf-8)?

For example[1]:

{
  "@context": {
    "ical": "http://www.w3.org/2002/12/cal/ical#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ical:dtstart": {
      "@type": "xsd:dateTime"
    }
  },
  "ical:summary": "Lady Gaga Concert",
  "ical:location": "New Orleans Arena, New Orleans, Louisiana, USA",
  "ical:dtstart": "2011-04-09T20:00Z"
}

Using the NCName regex, this is converted to:

<exml:anonymous xmlns="" xmlns:exml="http://www.agencexml.com/exml" xmlns:exsi="http://www.agencexml.com/exi" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <________ exml:fullname="@context">
        <ical xsi:type="xsd:string">http://www.w3.org/2002/12/cal/ical#</ical>
        <xsd xsi:type="xsd:string">http://www.w3.org/2001/XMLSchema#</xsd>
        <________ exml:fullname="ical:dtstart">
            <________ exml:fullname="@type" xsi:type="xsd:string">xsd:dateTime</________>
        </________>
    </________>
    <________ exml:fullname="ical:summary" xsi:type="xsd:string">Lady Gaga Concert</________>
    <________ exml:fullname="ical:location" xsi:type="xsd:string">New Orleans Arena, New Orleans, Louisiana, USA</________>
    <________ exml:fullname="ical:dtstart" xsi:type="xsd:string">2011-04-09T20:00Z</________>
</exml:anonymous>

[1] http://json-ld.org/playground/index.html

AlainCouthures commented 9 years ago

Allowing colons would require adding the corresponding namespace declarations.

This cannot be done without interpreting @context, which means that XSLTForms would already need to know that it is dealing with JSON-LD data, probably from the corresponding mediatype value. Interpreting @type would also be very interesting (multiple types for one node could be problematic...).
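To make the dependency on @context concrete, here is a rough sketch of how namespace declarations could be derived from it. This is not part of XSLTForms. The helper names are invented, the prefix check is ASCII-only, and it assumes every string-valued @context entry is meant as a prefix, which JSON-LD does not guarantee.

```javascript
// ASCII-only simplification of NCName, used to validate candidate prefixes.
const nsPrefixPattern = /^[A-Za-z_][A-Za-z_\-.0-9]*$/;

// Escape a value for use inside a double-quoted XML attribute.
function escapeAttr(value) {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
}

// Hypothetical helper: turn string-valued @context entries into xmlns declarations.
// Object-valued entries (term definitions such as "ical:dtstart") are skipped.
function namespaceDeclarations(context) {
  return Object.entries(context)
    .filter(([prefix, iri]) => typeof iri === "string" && nsPrefixPattern.test(prefix))
    .map(([prefix, iri]) => `xmlns:${prefix}="${escapeAttr(iri)}"`)
    .join(" ");
}

const context = {
  ical: "http://www.w3.org/2002/12/cal/ical#",
  xsd: "http://www.w3.org/2001/XMLSchema#",
  "ical:dtstart": { "@type": "xsd:dateTime" },
};

console.log(namespaceDeclarations(context));
// xmlns:ical="http://www.w3.org/2002/12/cal/ical#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
```

One thing to watch: the JSON-LD xsd IRI ends in "#", so it is not the same URI as the XML Schema namespace that xsi:type="xsd:string" normally relies on. A real implementation would have to avoid that prefix clash.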

Having a look at http://json-ld.org/playground/index.html, I wonder if XSLTForms should also offer to first convert the JSON-LD data into one of the three specified forms (Expanded, Compacted or Flattened), and allow the same for serialization.

What do you think?

timathom commented 9 years ago

Yes, I guess that adding built-in support for JSON-LD could involve a lot of overhead. JSON-LD Expanded seems to be the preferred serialization for data exchange, whereas the Compacted format might be easier to work with as the content of an XForms instance. The Expanded format also bypasses the need to deal with namespace prefixes, since full URIs are always used.

The jsonld.js library provides a full implementation of the spec.
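For a feel of why expansion removes the prefix problem, here is a deliberately simplified sketch. It only rewrites top-level prefix:suffix keys using string-valued @context entries. It is not the JSON-LD expansion algorithm (which also restructures values), and the function name `expandKeys` is made up for the example. For real data, a full implementation such as jsonld.js would be the thing to use.

```javascript
// Simplified illustration: replace "prefix:suffix" keys with full IRIs.
function expandKeys(doc) {
  const context = doc["@context"] || {};

  const expandTerm = (term) => {
    const colon = term.indexOf(":");
    if (colon <= 0) return term; // keywords like "@id" and plain terms are left alone
    const base = context[term.slice(0, colon)];
    // Only string-valued entries are treated as prefixes in this sketch.
    return typeof base === "string" ? base + term.slice(colon + 1) : term;
  };

  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === "@context") continue; // the context is consumed, not copied
    out[expandTerm(key)] = value;
  }
  return out;
}

const expanded = expandKeys({
  "@context": { ical: "http://www.w3.org/2002/12/cal/ical#" },
  "ical:summary": "Lady Gaga Concert",
  "ical:dtstart": "2011-04-09T20:00Z",
});

console.log(Object.keys(expanded));
```

The resulting keys are full IRIs, so none of them is a valid NCName and all of them would end up in exml:fullname, with no namespace declarations needed.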

In general, the problem with editing RDF in XForms would seem to center on the mismatch between the basic data model (subject-predicate-object) and its many possible serializations. It's difficult to reliably use RDF/XML as an XForms instance format, because one cannot always be sure that the data will be serialized in the same way when it is retrieved from a triplestore, for example. JSON-LD is similarly complex (maybe more so), but has well-defined algorithms for converting from one flavor to another.

I've actually been thinking about using HTML fragments with RDFa attributes in my XForms instance data, since RDFa is less brittle than other RDF serializations. Of course, most triplestores don't support RDFa as an export format, so if I fetch RDF data from a triplestore, it still has to be parsed before it can be used in an XForms interface. So far, I've been using Norman Walsh's RDF extension for XML Calabash to handle that part.

timathom commented 8 years ago

JSON-LD can wrap a series of JSON objects in a @graph array to support anonymous or named graph structures. The current EXML serialization in XSLTForms does not preserve the @graph array, however. See the following example with a JSON-LD document followed by the current EXML representation:

JSON-LD

{
    "@context": {
        "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
        "void": "http://rdfs.org/ns/void#",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "dcmitype": "http://purl.org/dc/dcmitype/",
        "relations": "http://pelagios.github.io/vocab/relations#",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "dcterms": "http://purl.org/dc/terms/",
        "nm": "http://nomisma.org/id/",
        "nmo": "http://nomisma.org/ontology#",
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "crm": "http://www.cidoc-crm.org/cidoc-crm/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "oa": "http://www.w3.org/ns/oa#",
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "pelagios": "http://pelagios.github.io/vocab/terms#"
    },
    "@graph": [
        {
            "@id": "http://numismatics.org/ocre/id/ric.8.cnp.36",
            "@type": [
                "nmo:TypeSeriesItem",
                "http://www.w3.org/2004/02/skos/core#Concept"
            ],
            "rdf:type": ["http://www.w3.org/2004/02/skos/core#Concept"],
            "skos:prefLabel": [{
                "@value": "RIC VIII Constantinople 36",
                "@language": "en"
            }],
            "skos:definition": [{
                "@value": "RIC VIII Constantinople 36",
                "@language": "en"
            }],
            "dcterms:source": [{"@id": "http://nomisma.org/id/ric"}],
            "nmo:representsObjectType": [{"@id": "http://nomisma.org/id/coin"}],
            "nmo:hasManufacture": [{"@id": "http://nomisma.org/id/struck"}],
            "nmo:hasDenomination": [{"@id": "http://nomisma.org/id/ae3"}],
            "nmo:hasMaterial": [
                {"@id": "http://nomisma.org/id/billon"},
                {"@id": "http://nomisma.org/id/ae"}
            ],
            "nmo:hasAuthority": [{"@id": "http://nomisma.org/id/constantius_ii"}],
            "nmo:hasMint": [{"@id": "http://nomisma.org/id/constantinople"}],
            "nmo:hasRegion": [{"@id": "http://nomisma.org/id/thrace"}],
            "nmo:hasStartDate": [{
                "@value": "0337",
                "@type": "http://www.w3.org/2001/XMLSchema#gYear"
            }],
            "nmo:hasEndDate": [{
                "@value": "0340",
                "@type": "http://www.w3.org/2001/XMLSchema#gYear"
            }],
            "nmo:hasObverse": [{"@id": "http://numismatics.org/ocre/id/ric.8.cnp.36#obverse"}],
            "nmo:hasReverse": [{"@id": "http://numismatics.org/ocre/id/ric.8.cnp.36#reverse"}],
            "void:inDataset": [{"@id": "http://numismatics.org/ocre/"}]
        },
        {
            "@id": "http://numismatics.org/ocre/id/ric.8.cnp.36#obverse",
            "nmo:hasLegend": [{"@value": "FL MAX THEO-DORAE AVG"}],
            "dcterms:description": [{
                "@value": "Bust of Theodora, hair elaborately dressed, wearing plain mantle and necklace, right",
                "@language": "en"
            }],
            "nmo:hasPortrait": [{"@id": "http://nomisma.org/id/theodora"}]
        },
        {
            "@id": "http://numismatics.org/ocre/id/ric.8.cnp.36#reverse",
            "nmo:hasLegend": [{"@value": "PIETAS - ROMANA•"}],
            "dcterms:description": [{
                "@value": "Pietas, draped, standing front, head right, carrying an infant at her breast in right hand",
                "@language": "en"
            }],
            "nmo:hasPortrait": [{"@id": "http://collection.britishmuseum.org/id/person-institution/59792"}]
        }
    ]
}

EXML

<exml:anonymous xmlns:exml="http://www.agencexml.com/exml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:exsi="http://www.agencexml.com/exi" xmlns="">
    <________ exml:fullname="@context">
        <geo xsi:type="xsd:string"
            >http://www.w3.org/2003/01/geo/wgs84_pos#</geo>
        <void xsi:type="xsd:string">http://rdfs.org/ns/void#</void>
        <foaf xsi:type="xsd:string">http://xmlns.com/foaf/0.1/</foaf>
        <dcmitype xsi:type="xsd:string">http://purl.org/dc/dcmitype/</dcmitype>
        <relations xsi:type="xsd:string"
            >http://pelagios.github.io/vocab/relations#</relations>
        <xsd xsi:type="xsd:string">http://www.w3.org/2001/XMLSchema#</xsd>
        <dcterms xsi:type="xsd:string">http://purl.org/dc/terms/</dcterms>
        <nm xsi:type="xsd:string">http://nomisma.org/id/</nm>
        <nmo xsi:type="xsd:string">http://nomisma.org/ontology#</nmo>
        <rdf xsi:type="xsd:string"
            >http://www.w3.org/1999/02/22-rdf-syntax-ns#</rdf>
        <crm xsi:type="xsd:string">http://www.cidoc-crm.org/cidoc-crm/</crm>
        <rdfs xsi:type="xsd:string">http://www.w3.org/2000/01/rdf-schema#</rdfs>
        <oa xsi:type="xsd:string">http://www.w3.org/ns/oa#</oa>
        <skos xsi:type="xsd:string">http://www.w3.org/2004/02/skos/core#</skos>
        <pelagios xsi:type="xsd:string"
            >http://pelagios.github.io/vocab/terms#</pelagios>
    </________>
    <________ exml:fullname="________" exsi:maxOccurs="unbounded">
        <________ exml:fullname="@id" xsi:type="xsd:string"
            >http://numismatics.org/ocre/id/ric.8.cnp.36</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded"
            xsi:type="xsd:string">nmo:TypeSeriesItem</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded"
            xsi:type="xsd:string"
            >http://www.w3.org/2004/02/skos/core#Concept</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded"
            xsi:type="xsd:string"
            >http://www.w3.org/2004/02/skos/core#Concept</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">RIC VIII
                Constantinople 36</________>
            <________ exml:fullname="@language" xsi:type="xsd:string"
                >en</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">RIC VIII
                Constantinople 36</________>
            <________ exml:fullname="@language" xsi:type="xsd:string"
                >en</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/ric</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/coin</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/struck</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/ae3</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/billon</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/ae</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/constantius_ii</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/constantinople</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/thrace</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string"
                >0337</________>
            <________ exml:fullname="@type" xsi:type="xsd:string"
                >http://www.w3.org/2001/XMLSchema#gYear</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string"
                >0340</________>
            <________ exml:fullname="@type" xsi:type="xsd:string"
                >http://www.w3.org/2001/XMLSchema#gYear</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://numismatics.org/ocre/id/ric.8.cnp.36#obverse</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://numismatics.org/ocre/id/ric.8.cnp.36#reverse</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://numismatics.org/ocre/</________>
        </________>
    </________>
    <________ exml:fullname="________" exsi:maxOccurs="unbounded">
        <________ exml:fullname="@id" xsi:type="xsd:string"
            >http://numismatics.org/ocre/id/ric.8.cnp.36#obverse</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">FL MAX
                THEO-DORAE AVG</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">Bust of
                Theodora, hair elaborately dressed, wearing plain mantle and
                necklace, right</________>
            <________ exml:fullname="@language" xsi:type="xsd:string"
                >en</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://nomisma.org/id/theodora</________>
        </________>
    </________>
    <________ exml:fullname="________" exsi:maxOccurs="unbounded">
        <________ exml:fullname="@id" xsi:type="xsd:string"
            >http://numismatics.org/ocre/id/ric.8.cnp.36#reverse</________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">PIETAS -
                ROMANA</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@value" xsi:type="xsd:string">Pietas,
                draped, standing front, head right, carrying an infant at her
                breast in right hand</________>
            <________ exml:fullname="@language" xsi:type="xsd:string"
                >en</________>
        </________>
        <________ exml:fullname="________" exsi:maxOccurs="unbounded">
            <________ exml:fullname="@id" xsi:type="xsd:string"
                >http://collection.britishmuseum.org/id/person-institution/59792</________>
        </________>
    </________>
</exml:anonymous>
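Purely as an illustration of what preserving the array name could look like, here is a toy converter that carries the key of an array on each repeated element. It is not the XSLTForms json2xml implementation and not a proposal for the exact EXML shape: the function names are invented, the name check is ASCII-only, and xsi:type is left out.

```javascript
// ASCII-only simplification of NCName.
const xmlNamePattern = /^[A-Za-z_][A-Za-z_\-.0-9]*$/;

const escapeXml = (s) =>
  String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");

// Hypothetical converter: array items repeat the element of their key,
// so "@graph" stays visible in exml:fullname on every item.
function toXml(name, value, inArray = false) {
  if (Array.isArray(value)) {
    return value.map((item) => toXml(name, item, true)).join("");
  }
  const valid = xmlNamePattern.test(name);
  const tag = valid ? name : "________";
  const attrs =
    (valid ? "" : ` exml:fullname="${escapeXml(name)}"`) +
    (inArray ? ' exsi:maxOccurs="unbounded"' : "");
  const body =
    value !== null && typeof value === "object"
      ? Object.entries(value).map(([k, v]) => toXml(k, v)).join("")
      : escapeXml(value);
  return `<${tag}${attrs}>${body}</${tag}>`;
}

console.log(toXml("@graph", [{ "@id": "a" }, { "@id": "b" }]));
```

With the key kept this way, an XPath such as *[@exml:fullname = '@graph'] could select the graph members, which the representation above does not allow.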