filip26 / ld-cli

A Command Line Interface for Linked Data Processing
Apache License 2.0

add `--expand-context` option [was: flatten with --context does not seem to be working (?)] #95

Closed vorburger closed 1 month ago

vorburger commented 1 month ago

This works:

ld-cli flatten --mode=1.1 --input=file:"$PWD"/test/picasso.json

with picasso.json, which contains two @context entries that load picasso-context.jsonld.

When I remove both @context entries, as I plan to in https://github.com/enola-dev/enola/pull/796/files (WIP), the following no longer works. It does not fail, but content is missing from the output:

ld-cli flatten --mode=1.1 --context=file:"$PWD"/test/picasso-context.jsonld --input=file:"$PWD"/test/picasso.json

PS, just FYI: I had used ld-cli for exploring and learning JSON-LD (Thank You for creating it!), and am now moving on to directly using https://github.com/filip26/titanium-json-ld (actually its https://github.com/HASMAC-AS/hasmac-json-ld fork, via https://github.com/eclipse-rdf4j/rdf4j) and it works there. So I don't really need a fix, but just wanted to let you know about this, in case you wanted to look into it.

filip26 commented 1 month ago

Thank you for reporting it. It's probably a bug that needs to be confirmed by a test.

filip26 commented 1 month ago

@vorburger could you please try the latest build (downloadables are at the bottom)? Does this example address your use case?

> ./ld-cli flatten --mode=1.1 --context=https://raw.githubusercontent.com/vorburger/enola/d27b161d1ff13eb2d963e32d9aea4ea0fea58ebe/test/picasso-context.jsonld --input=https://raw.githubusercontent.com/enola-dev/enola/447133ceb759cf87df30dbc18e54b8606c904272/test/picasso.json --pretty

vorburger commented 1 month ago

@filip26 unfortunately it still does not seem to. When I run your command, it produces:

{
    "@graph": [
        {
            "id": "http://example.enola.dev/Picasso",
            "type": "http://example.enola.dev/Artist",
            "homeAddress": {
                "id": "_:b0"
            },
            "location": "Spain",
            "firstName": "Pablo"
        },
        {
            "id": "_:b0",
            "http://example.enola.dev/city": "Barcelona",
            "http://example.enola.dev/street": "31 Art Gallery"
        },
        {
            "id": "http://example.enola.dev/Dalí",
            "type": "http://example.enola.dev/Artist",
            "firstName": [
                "Salvador",
                "Domingo",
                "Felipe",
                "Jacinto"
            ],
            "birthDate": "1904-05-11"
        }
    ],
    "@context": {
        "@version": 1.1,
        "id": "@id",
        "type": "@type",
        "firstName": "http://xmlns.com/foaf/0.1/firstName",
        "birthDate": {
            "@id": "https://schema.org/birthDate",
            "@type": "https://schema.org/Date"
        },
        "location": {
            "@id": "http://www.w3.org/ns/locn#location",
            "@language": "en"
        },
        "homeAddress": {
            "@id": "http://example.enola.dev/homeAddress",
            "@context": {
                "city": "http://example.enola.dev/city",
                "street": "http://example.enola.dev/street"
            }
        }
    }
}

I could simply be misunderstanding something about JSON-LD, so let me try to explain better. Originally it was:

$ /home/vorburger/Downloads/ld-cli-0.9.0-ubuntu-latest/ld-cli flatten --mode=1.1 --input=https://raw.githubusercontent.com/enola-dev/enola/447133ceb759cf87df30dbc18e54b8606c904272/test/picasso.json --pretty

where picasso.json contains the @context. This produces the following, which seems correct:

[
    {
        "@id": "http://example.enola.dev/Picasso",
        "@type": [
            "http://example.enola.dev/Artist"
        ],
        "http://example.enola.dev/homeAddress": [
            {
                "@id": "_:b0"
            }
        ],
        "http://www.w3.org/ns/locn#location": [
            {
                "@value": "Spain",
                "@language": "en"
            }
        ],
        "http://xmlns.com/foaf/0.1/firstName": [
            {
                "@value": "Pablo"
            }
        ]
    },
    {
        "@id": "_:b0",
        "http://example.enola.dev/city": [
            {
                "@value": "Barcelona"
            }
        ],
        "http://example.enola.dev/street": [
            {
                "@value": "31 Art Gallery"
            }
        ]
    },
    {
        "@id": "http://example.enola.dev/Dalí",
        "@type": [
            "http://example.enola.dev/Artist"
        ],
        "http://xmlns.com/foaf/0.1/firstName": [
            {
                "@value": "Salvador"
            },
            {
                "@value": "Domingo"
            },
            {
                "@value": "Felipe"
            },
            {
                "@value": "Jacinto"
            }
        ],
        "https://schema.org/birthDate": [
            {
                "@value": "1904-05-11",
                "@type": "https://schema.org/Date"
            }
        ]
    }
]

If we now try another picasso.json from which the @context was removed and, instead of having it inline, pass --context to fetch picasso-context.jsonld from the exact same Git revision as before, it doesn't seem to work. The output has lost firstName etc., which doesn't seem right to me: switching from an "inline" to an "external" context shouldn't change the result, no?

$ /home/vorburger/Downloads/ld-cli-0.9.0-ubuntu-latest/ld-cli flatten --mode=1.1 --input=https://raw.githubusercontent.com/enola-dev/enola/14fa34d307a5d2811196e4b377d40199eff8ec44/test/picasso.json --context=https://raw.githubusercontent.com/vorburger/enola/447133ceb759cf87df30dbc18e54b8606c904272/test/picasso-context.jsonld --pretty
{
    "@graph": [
        {
            "@id": "http://example.enola.dev/Picasso",
            "@type": "http://example.enola.dev/Artist"
        },
        {
            "@id": "http://example.enola.dev/Dalí",
            "@type": "http://example.enola.dev/Artist"
        }
    ],
    "@context": {
        "@version": 1.1,
        "firstName": "http://xmlns.com/foaf/0.1/firstName",
        "birthDate": {
            "@id": "https://schema.org/birthDate",
            "@type": "https://schema.org/Date"
        },
        "location": {
            "@id": "http://www.w3.org/ns/locn#location",
            "@language": "en"
        },
        "homeAddress": {
            "@id": "http://example.enola.dev/homeAddress",
            "@context": {
                "city": "http://example.enola.dev/city",
                "street": "http://example.enola.dev/street"
            }
        }
    }
}

FYI: I had used ld-cli (Thank You for creating it!) for exploring and learning JSON-LD. Meanwhile, in (my own) https://docs.enola.dev/models/example.org/json-ld/, I have moved on to directly using https://github.com/filip26/titanium-json-ld (actually its https://github.com/HASMAC-AS/hasmac-json-ld fork, via https://github.com/eclipse-rdf4j/rdf4j), and there it works like this:

$ ./enola rosetta --in="file:test/picasso.json?context=file:test/picasso-context.jsonld" --out="fd:1?mediaType=application/ld+json"
[
    {
        "@id": "_:b0",
        "http://example.enola.dev/city": [
            {
                "@value": "Barcelona"
            }
        ],
        "http://example.enola.dev/street": [
            {
                "@value": "31 Art Gallery"
            }
        ]
    },
    {
        "@id": "http://example.enola.dev/Dalí",
        "@type": [
            "http://example.enola.dev/Artist"
        ],
        "http://xmlns.com/foaf/0.1/firstName": [
            {
                "@value": "Salvador"
            },
            {
                "@value": "Domingo"
            },
            {
                "@value": "Felipe"
            },
            {
                "@value": "Jacinto"
            }
        ],
        "https://schema.org/birthDate": [
            {
                "@value": "1904-05-11",
                "@type": "https://schema.org/Date"
            }
        ]
    },
    {
        "@id": "http://example.enola.dev/Picasso",
        "@type": [
            "http://example.enola.dev/Artist"
        ],
        "http://example.enola.dev/homeAddress": [
            {
                "@id": "_:b0"
            }
        ],
        "http://www.w3.org/ns/locn#location": [
            {
                "@language": "en",
                "@value": "Spain"
            }
        ],
        "http://xmlns.com/foaf/0.1/firstName": [
            {
                "@value": "Pablo"
            }
        ]
    }
]

filip26 commented 1 month ago

@vorburger thank you. The --context option for flatten is used to compact the flattened result. An --expand-context option is missing, and that's what you need. If you use Titanium directly, you can set an expansion context with JsonLdOptions.setExpandContext().
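
The difference between the two options can be illustrated with a deliberately simplified model. The sketch below is not the real JSON-LD expansion or compaction algorithm, and it is not how ld-cli or Titanium is implemented; the `expand` and `compact` helpers and the one-term `CONTEXT` are hypothetical and only mimic one rule: during expansion, a key that is neither a keyword nor mapped to an IRI is dropped. That is presumably why firstName disappeared when the context was supplied only for compaction.

```python
# Toy model of JSON-LD term handling (an illustration only, NOT the real algorithms).

def expand(node, context):
    """Map terms to IRIs; drop keys that are neither keywords, mapped terms, nor IRI-like."""
    out = {}
    for key, value in node.items():
        if key.startswith("@"):
            out[key] = value               # keywords such as @id pass through
        elif key in context:
            out[context[key]] = value      # known term -> its IRI
        elif ":" in key:
            out[key] = value               # already looks like an IRI
        # any other key has no IRI and is silently dropped
    return out


def compact(node, context):
    """Shorten IRIs back to terms where the context defines one."""
    iri_to_term = {iri: term for term, iri in context.items()}
    return {iri_to_term.get(key, key): value for key, value in node.items()}


# Hypothetical one-term context and document, loosely based on the Picasso example.
CONTEXT = {"firstName": "http://xmlns.com/foaf/0.1/firstName"}
DOC = {"@id": "http://example.enola.dev/Picasso", "firstName": "Pablo"}

# Like --context alone: expansion sees no context, so firstName is already gone
# by the time compaction runs.
context_only = compact(expand(DOC, {}), CONTEXT)

# Like --expand-context: the term is resolved during expansion and survives.
with_expand_context = expand(DOC, CONTEXT)

print(context_only)         # {'@id': 'http://example.enola.dev/Picasso'}
print(with_expand_context)  # {'@id': ..., 'http://xmlns.com/foaf/0.1/firstName': 'Pablo'}
```

In this model, a compaction context can only rename IRIs that survived expansion, so it cannot bring back properties that expansion already discarded.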

filip26 commented 1 month ago

@vorburger please try the latest release with

ld-cli flatten --mode=1.1 --expand-context=https://raw.githubusercontent.com/vorburger/enola/d27b161d1ff13eb2d963e32d9aea4ea0fea58ebe/test/picasso-context.jsonld --input=https://raw.githubusercontent.com/enola-dev/enola/14fa34d307a5d2811196e4b377d40199eff8ec44/test/picasso.json --pretty

and feel free to re-open the issue.

vorburger commented 1 month ago

@filip26 yeah, that works now; cool, thank you. I've noted the (new) --expand-context=<expandContext>, "context IRI to expand the document before flattening", versus --context=<context>, "context IRI to compact the flattened document":

$ ld-cli flatten --help
Usage: ld-cli flatten [-aop] [-b=<base>] [-c=<context>] [-e=<expandContext>]
                      [-i=<input>] [-m=1.0|1.1]

Flatten JSON-LD document and optionally compact it using a context

Options:
  -a, --keep-arrays         keep arrays with just one element
  -b, --base=<base>         input document base IRI
  -c, --context=<context>   context IRI to compact the flattened document
  -e, --expand-context=<expandContext>
                            context IRI to expand the document before flattening
  -i, --input=<input>       input document IRI
  -m, --mode=1.0|1.1        processing mode
  -o, --ordered             certain algorithm processing steps are ordered
                              lexicographically
  -p, --pretty              pretty print output JSON