apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

When serializing Dataset to JSONLD the @context is populated only from default graph #2031

Open so-dewy opened 1 year ago

so-dewy commented 1 year ago

Version

4.4.0

What happened?

Hi, I'm using Jena 4.4.0 and hitting a bug when serializing a Dataset with RDFFormat.JSONLD: the @context is populated only with properties from the default graph. The test.trig file from which the Dataset is initialized contains this:

_:object_1 a _:objectClass2 ;
        <http://big.prefix.com#compacted.property> _:compacted.

<urn:test-graph> {
    _:object_2 a _:objectClass2 ;
            <http://big.prefix.com#uncompacted.property> _:un .
}

To serialize the Dataset I use the following code:

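import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;

public class Test {
    public void test() throws IOException {
        Dataset dataset = createDataset();

        // Serialize the whole dataset (default graph plus named graphs) as JSON-LD.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        RDFDataMgr.write(out, dataset, RDFFormat.JSONLD);

        System.out.println(out);
    }

    // Loads test.trig from the classpath into a fresh in-memory dataset.
    private Dataset createDataset() throws IOException {
        Dataset dataset = DatasetFactory.create();
        ClassLoader classloader = Thread.currentThread().getContextClassLoader();

        try (InputStream is = classloader.getResourceAsStream("test.trig")) {
            assert is != null;
            RDFDataMgr.read(dataset, is, Lang.TRIG);
        }
        return dataset;
    }
}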

This results in the following JSON-LD, where compacted.property from the default graph is in the @context and uncompacted.property is not:

{
  "@graph" : [ {
    "@id" : "_:b0",
    "@type" : "_:b2",
    "compacted.property" : "_:b1"
  }, {
    "@graph" : [ {
      "@id" : "_:b3",
      "@type" : "_:b2",
      "http://big.prefix.com#uncompacted.property" : {
        "@id" : "_:b4"
      }
    } ],
    "@id" : "urn:test-graph"
  } ],
  "@context" : {
    "compacted.property" : {
      "@id" : "http://big.prefix.com#compacted.property",
      "@type" : "@id"
    }
  }
}

I took a look around the code and it seems like the problem is in the method JsonLDWriter.getJsonldContext, particularly this line:

// if no ctx passed via jenaContext, create one in order to have localnames as keys for properties
ctx = createJsonldContext(dataset.getDefaultGraph(), prefixMap, addAllPrefixesToContextFlag(jenaContext)) ;

The context is created only for the default graph. The method also seems to check for a user-provided context a little higher up, but for the life of me I couldn't figure out how to provide one :(
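To make the idea concrete, here is a minimal sketch of what "build the context from every graph" could look like. It deliberately does not use the Jena API and is not the code of JsonLDWriter: the dataset is modelled as a plain map from graph name to the predicate IRIs used in that graph, and `ContextSketch`, `buildContext`, `localName` and `DEFAULT_GRAPH` are hypothetical names introduced only for illustration. It also omits the `"@type" : "@id"` coercion entries that appear in the 4.4.0 output above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified model, not Jena code: graph name -> predicate IRIs used in that graph. */
class ContextSketch {

    /** Key used for the default graph in this sketch (an assumption, not a Jena constant). */
    static final String DEFAULT_GRAPH = "";

    /** Returns the part of the IRI after the last '#' or '/', or the whole IRI if neither occurs. */
    static String localName(String iri) {
        int idx = Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'));
        return idx >= 0 ? iri.substring(idx + 1) : iri;
    }

    /** Builds term -> IRI entries from all graphs, not only the default one. */
    static Map<String, String> buildContext(Map<String, List<String>> predicatesByGraph) {
        Map<String, String> context = new LinkedHashMap<>();
        for (List<String> predicates : predicatesByGraph.values()) {
            for (String iri : predicates) {
                // Keep the first IRI seen for a local name so a clash does not remap an existing term.
                context.putIfAbsent(localName(iri), iri);
            }
        }
        return context;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dataset = new LinkedHashMap<>();
        dataset.put(DEFAULT_GRAPH, List.of("http://big.prefix.com#compacted.property"));
        dataset.put("urn:test-graph", List.of("http://big.prefix.com#uncompacted.property"));
        System.out.println(buildContext(dataset));
    }
}
```

With the two graphs from test.trig, both compacted.property and uncompacted.property end up as terms. In the real writer the equivalent change would presumably be to walk the named graphs as well as dataset.getDefaultGraph() when collecting properties.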

I upgraded to the latest version (4.9.0), but there the @context does not seem to be initialized at all, so all the properties are in full form:

{
    "@graph": [
        {
            "@id": "_:b0",
            "http://big.prefix.com#compacted.property": {
                "@id": "_:b1"
            },
            "@type": "_:b2"
        },
        {
            "@id": "urn:test-graph",
            "@graph": [
                {
                    "@id": "_:b3",
                    "http://big.prefix.com#uncompacted.property": {
                        "@id": "_:b4"
                    },
                    "@type": "_:b2"
                }
            ]
        }
    ]
}

I took a look around and it seems that in version 4.4.0 RDFFormat.JSONLD is set to the RDFFormat.JSONLD_COMPACT_PRETTY variant, but in version 4.9.0 it is set to RDFFormat.JSONLD_PRETTY, and RDFFormat.JSONLD_COMPACT_PRETTY is marked as deprecated in favor of JSONLD11. But JSONLD11 does not seem to offer a compact pretty variant; if I'm blind, please point me in the right direction.

Btw, in version 4.9.0, serializing with the deprecated RDFFormat.JSONLD_COMPACT_PRETTY still reproduces the bug exactly as in 4.4.0: the context is formed only for the default graph.

A bunch of logic in my application depends on the properties of JSONLD being compacted so I would appreciate any help with this :D

  1. Is there any way to make @context auto-populated in version 4.9.0?
  2. Or is there perhaps a way I could provide the @context myself?

I could probably figure out a way to fix the JsonLDWriter.getJsonldContext method and open a pull request, but I don't know whether compacted JSONLD is deprecated, which would make that pointless.

I attached a reproducer project archive: jena-reproducer.tar.gz

Are you interested in making a pull request?

Yes

afs commented 1 year ago

The JSON-LD provider for 4.4.0 is the GitHub project jsonld-java, which supports JSON-LD 1.0.

In 4.9.0, the default JSON-LD provider is [titanium-json-ld](https://github.com/filip26/titanium-json-ld), which provides JSON-LD 1.1. RDFFormat.JSONLD_COMPACT_PRETTY is deprecated because it is backed by jsonld-java.

In Jena 5, the JSON-LD 1.0 support is being removed.

Does Titanium have the features you are looking for?

A contribution for passing the @context to the writer (using the RIOT Java Context) would be appreciated. The writer is JsonLD11Writer.

The writer setup for JSON-LD 1.0 was a contribution.

A bunch of logic in my application depends on the properties of JSON-LD being compacted so I would appreciate any help with this :D

The primary goal of writing RDF in JSON-LD has been to get legal and correct JSON-LD written. Appearance has been secondary because creating idiomatic JSON is often usage-dependent, i.e. JSON-LD framing. It might be better to get basic JSON-LD out and then run it through a toolchain that does framing, which gives more control in an easier-to-use fashion.
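As a rough illustration of such a post-processing step, the sketch below takes already-parsed JSON (nested `Map`/`List` values) and renames full-IRI keys to short terms using a caller-supplied IRI-to-term map. This is only a toy key-renaming pass under stated assumptions, not the JSON-LD compaction or framing algorithm: it ignores `@type` coercion, prefixes and scoped contexts, and `KeyShortener` and `shorten` are hypothetical names. A real toolchain would hand the document and a context to a JSON-LD processor instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy post-processor: renames full-IRI keys to short terms. Not JSON-LD compaction. */
class KeyShortener {

    /** Recursively copies a parsed JSON value, replacing keys found in iriToTerm with their terms. */
    static Object shorten(Object value, Map<String, String> iriToTerm) {
        if (value instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                String key = String.valueOf(entry.getKey());
                // Keys without a mapping (including keywords such as @id) are kept unchanged.
                out.put(iriToTerm.getOrDefault(key, key), shorten(entry.getValue(), iriToTerm));
            }
            return out;
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(shorten(item, iriToTerm));
            }
            return out;
        }
        // Strings, numbers, booleans and null pass through untouched.
        return value;
    }
}
```

Calling `shorten` on the node for `_:b3` from the 4.9.0 output, with a map from `http://big.prefix.com#uncompacted.property` to `uncompacted.property`, yields the same node with the short key while `@id` and `@type` stay as they are.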