quangis / transforge

Describe processes as type transformations, with inference that supports subtypes and parametric polymorphism. Create and query corresponding transformation graphs.
GNU General Public License v3.0

Parse queries from RDF/JSON-LD #77

Closed. nsbgn closed this issue 2 years ago

nsbgn commented 2 years ago

Queries are now represented in Python. That was convenient, but it isn't good for interoperability with the web module or for writing scripts. A JSON representation makes a lot more sense, for example: [{"or": ["max","min"]}]
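To make the JSON idea concrete, here is a minimal sketch of how such a query could be read. It assumes that a list is a sequence of steps and that {"or": [...]} marks a choice between operations; neither reading is fixed anywhere in this issue. The expand helper and the "ratio" step are hypothetical.

```python
import json
from itertools import product

def expand(query):
    """Expand {"or": [...]} choices into every concrete sequence of steps.

    Assumption: a query is a JSON list of steps, and a step is either a
    plain string or an object with a single "or" key listing alternatives.
    """
    options = []
    for step in query:
        if isinstance(step, dict) and "or" in step:
            options.append(list(step["or"]))
        else:
            options.append([step])
    return [list(combination) for combination in product(*options)]

# "ratio" is a made-up second step, only there to show the expansion.
query = json.loads('[{"or": ["max", "min"]}, "ratio"]')
alternatives = expand(query)
```

Here alternatives holds two sequences, one through max and one through min, each followed by the hypothetical ratio step.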

nsbgn commented 2 years ago

YAML is more concise. A representation in RDF would be more uniform and was also considered, but that is maybe for later. A more concise representation of types would be needed (R_Obj.Tuple_Reg.Nom is unwieldy; consider R-Obj~Reg-Nom?). Cf. https://www.rfc-editor.org/rfc/rfc3986#page-12

nsbgn commented 2 years ago

Actually, RDF would be good. It would mean we have a unified format for input and output, and the translation from questions would be more straightforward (since that is also provided as a graph, not a tree). And we would get visual representations for free.

nsbgn commented 2 years ago

It is possible to have Turtle-compatible IRIs that are visually quite close to our original notation, like "cct:R‹Obj·Reg⁎Nom›". However, this requires Unicode characters (see https://www.w3.org/TeamSubmission/turtle/#nameChar). I think it's better to keep the IRIs ASCII-compatible so that you can easily type them.
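As a rough illustration of the trade-off, the sketch below percent-encodes the Unicode name from the example above using only the Python standard library. It is not part of transforge; it only shows what happens to such a name wherever a plain ASCII URI is required.

```python
from urllib.parse import quote, unquote

# The Unicode local name from the example above.
unicode_name = "R‹Obj·Reg⁎Nom›"

# Percent-encode every character that is not an unreserved ASCII character.
ascii_form = quote(unicode_name, safe="")

# The encoding is reversible, but the ASCII form is no longer readable
# and is harder to type than the original.
round_trip = unquote(ascii_form)
```

The name survives the round trip, but each of the four non-ASCII characters becomes a multi-byte %XX sequence.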

I'm going to try out some ways to represent queries as RDF and test them with rdfpipe.

nsbgn commented 2 years ago

Example:

@prefix from: <http://github.com/quangis/transformation-algebra#from>.
@prefix ta: <http://github.com/quangis/transformation-algebra#>.
@prefix cct: <http://github.com/quangis/cct#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix repo: <http://example.com#>.

_:query
    a ta:Query;
    rdfs:comment "What is the portion of noise ≥ 70dB in Amsterdam?";
    ta:workflows (repo:NoisePortionAmsterdam);
    ta:output <A>.

<A> cct:type "R(Ord, Reg)"; from: <B>.
    <B> cct:type "R(Loc, Ord)"; from: <C>, <E>.
       <C> cct:type "R(Loc, Ord)"; from: <D>.
           <D> cct:type "R(Ord, Reg)".
       <E> cct:type "R(Obj, Reg * Nom)".

Or even:

@prefix : <http://github.com/quangis/transformation-algebra#>.
@prefix from: <http://github.com/quangis/transformation-algebra#from>.
@prefix cct: <http://github.com/quangis/cct#type>.
@prefix op: <http://github.com/quangis/cct#operator>.

[] a :Query;
    :question "What is the portion of noise ≥ 70dB in Amsterdam?";
    :workflow <http://example.com#NoisePortionAmsterdam>;
    :output
        [ cct: "R(Ord, Reg)"; from:
            [ cct: "R(Loc, Ord)"; from:
                [ cct: "R(Loc, Ord)"; from:
                    [ cct: "R(Ord, Reg)" ]
                ],
                [ cct: "R(Obj, Reg * Nom)" ]
            ]
        ].
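For readers less used to Turtle, the same query graph can be sketched in plain Python. This only mirrors the structure of the examples above. The steps dictionary and the paths helper are hypothetical and do not reflect how transforge stores or evaluates queries; the sketch assumes that from: edges point from a step to the steps that feed into it.

```python
# node: (type string, upstream nodes), copied from the first Turtle example
steps = {
    "A": ("R(Ord, Reg)", ["B"]),
    "B": ("R(Loc, Ord)", ["C", "E"]),
    "C": ("R(Loc, Ord)", ["D"]),
    "D": ("R(Ord, Reg)", []),
    "E": ("R(Obj, Reg * Nom)", []),
}

def paths(node):
    """Yield every chain of types from `node` back to a node without sources."""
    type_, sources = steps[node]
    if not sources:
        yield [type_]
    for source in sources:
        for rest in paths(source):
            yield [type_, *rest]

# Two chains lead back from the output <A>: one via <C> and <D>, one via <E>.
chains = list(paths("A"))
```

Because <B> has two sources, the query is a graph with branches rather than a single chain of steps.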

nsbgn commented 2 years ago

Developments: ec5eae2df1c20b9bc78894adb28fa2c822c700fb reintroduces querying for different aspects.

nsbgn commented 2 years ago

Most of this has been implemented and is now used in the cct repository. See https://github.com/quangis/cct/blob/140e30b00461b71623cee06fa441516433b67773/tasks/01a-NoisePortionAmsterdam.ttl for an example of the expected query format as it is now. I will close this issue once choice has been reimplemented.

nsbgn commented 2 years ago

A limited form of choice is now implemented, which was the last piece missing from this issue.

I say limited: when a step node in a transformation graph is the subject of multiple ta:type or ta:via predicates, the associated query accepts their union. This accommodates choice: if you want to query for a step that goes via either operation f or operation g, you can simply write [] ta:via <f>, <g>.
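The union reading of ta:via can be sketched as a plain membership test. This is an illustration under assumptions, not the actual matching code: the step_accepts helper is hypothetical, a query step is modelled as a dictionary, and a step without any via alternatives is taken to mean "no constraint".

```python
def step_accepts(query_step, operation):
    """Return True if `operation` is one of the step's `via` alternatives.

    Assumption: a missing or empty `via` set places no constraint on the step.
    """
    via = query_step.get("via", set())
    return not via or operation in via

# Stands for the Turtle statement: [] ta:via <f>, <g>.
either = {"via": {"f", "g"}}

accepted = [op for op in ("f", "g", "h") if step_accepts(either, op)]
```

With these inputs, accepted contains f and g but not h.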

However, it remains impossible to choose between entire branches rather than just between operators or types, that is, where the alternatives also differ in their subsequent steps. To accommodate this, I think using an rdf:List makes most sense:

<step1> :from (<step2a> <step2b>).

However, I don't think we will need it. If need arises, it will be the subject of a new issue.

nsbgn commented 2 years ago

Note that this syntax is somewhat inconsistent: <step0> ta:from <step1>, <step2> is interpreted as needing a connection from both <step1> and <step2>, whereas <step0> ta:type <A>, <B> is interpreted as needing to have either <A> or <B>. For consistency, perhaps the syntax should be changed, but there are other things to attend to first.
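The asymmetry comes down to all versus any. The sketch below is a toy matcher under stated assumptions, not the transforge implementation: types are compared by plain equality rather than subtyping, a "connection from" a step may be indirect, and the workflow, its node names, and the matches helper are hypothetical.

```python
# Hypothetical workflow: node -> (type, direct upstream nodes)
workflow = {
    "out": ("A", ["mid1", "mid2"]),
    "mid1": ("B", []),
    "mid2": ("C", []),
}

def upstream(node):
    """All nodes reachable by following upstream edges from `node`."""
    seen, stack = set(), list(workflow[node][1])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(workflow[current][1])
    return seen

def matches(query, node):
    """Match one query step against one workflow node."""
    # Several types: ANY one of them is enough (disjunction).
    type_ok = workflow[node][0] in query["type"]
    # Several sources: EVERY one must be found upstream (conjunction).
    from_ok = all(
        any(matches(source, up) for up in upstream(node))
        for source in query["from"]
    )
    return type_ok and from_ok

step1 = {"type": {"B"}, "from": []}
step2 = {"type": {"C"}, "from": []}
step0 = {"type": {"A", "X"}, "from": [step1, step2]}  # X is a made-up type
result = matches(step0, "out")
```

In this toy workflow, step0 matches the out node even though it does not have type X, but it stops matching as soon as either step1 or step2 cannot be found upstream.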