Closed by nsbgn 2 years ago
YAML is more concise. A representation in RDF would also be more uniform; it was considered, but is maybe for later. A more concise representation of types would be needed (`R_Obj.Tuple_Reg.Nom` is unwieldy; consider `R-Obj~Reg-Nom`?). Cf. https://www.rfc-editor.org/rfc/rfc3986#page-12
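As a quick way to compare candidate notations, the sketch below checks whether a type name uses only the "unreserved" characters of RFC 3986 (section 2.3: letters, digits, `-`, `.`, `_`, `~`), which need no percent-encoding in a URI. The helper name `is_unreserved` is made up for illustration and is not part of any library.

```python
import string

# RFC 3986, section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def is_unreserved(name: str) -> bool:
    """Return True if `name` is non-empty and every character is unreserved."""
    return bool(name) and all(ch in UNRESERVED for ch in name)


# Hypothetical helper applied to the notations mentioned in this thread.
for candidate in ["R_Obj.Tuple_Reg.Nom", "R-Obj~Reg-Nom", "R(Obj, Reg * Nom)"]:
    print(candidate, is_unreserved(candidate))
```

Both compact notations pass this check; the parenthesised notation does not, because of the parentheses, comma, spaces and asterisk.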
Actually, RDF would be good. It would mean we have a unified format for input and output, and the translation from questions would be more straightforward (since that is also provided as a graph, not a tree). And we would get visual representations for free.
It is possible to have Turtle-compatible IRIs that are visually quite close to our original notation, like "cct:R‹Obj·Reg⁎Nom›". However, this requires Unicode characters (see https://www.w3.org/TeamSubmission/turtle/#nameChar). I think it's better to keep the IRIs ASCII-compatible so that you can easily type them.
I'm going to try out some ways to represent queries as RDF and test them with `rdfpipe`.
Example:
@prefix from: <http://github.com/quangis/transformation-algebra#from>.
@prefix ta: <http://github.com/quangis/transformation-algebra#>.
@prefix cct: <http://github.com/quangis/cct#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix repo: <http://example.com#>.
_:query
    a ta:Query;
    rdfs:comment "What is the portion of noise ≥ 70dB in Amsterdam?";
    ta:workflows (repo:NoisePortionAmsterdam);
    ta:output <A>.
<A> cct:type "R(Ord, Reg)"; from: <B>.
<B> cct:type "R(Loc, Ord)"; from: <C>, <E>.
<C> cct:type "R(Loc, Ord)"; from: <D>.
<D> cct:type "R(Ord, Reg)".
<E> cct:type "R(Obj, Reg * Nom)".
Or even:
@prefix : <http://github.com/quangis/transformation-algebra#>.
@prefix from: <http://github.com/quangis/transformation-algebra#from>.
@prefix cct: <http://github.com/quangis/cct#type>.
@prefix op: <http://github.com/quangis/cct#operator>.
[] a :Query;
    :question "What is the portion of noise ≥ 70dB in Amsterdam?";
    :workflow <http://example.com#NoisePortionAmsterdam>;
    :output
        [ cct: "R(Ord, Reg)"; from:
            [ cct: "R(Loc, Ord)"; from:
                [ cct: "R(Loc, Ord)"; from:
                    [ cct: "R(Ord, Reg)" ]
                ],
                [ cct: "R(Obj, Reg * Nom)" ]
            ]
        ].
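To see that the nested blank-node form describes the same structure as the flat form with named nodes, here is a rough sketch in plain Python (no RDF library). It models the nested query as `(type, sources)` tuples and flattens it into a node table plus a set of `from` edges. The tuple encoding and the `flatten` helper are illustrative assumptions, not how the library stores queries.

```python
# Nested form mirroring the blank-node example: (type, [sources...]).
nested = ("R(Ord, Reg)", [
    ("R(Loc, Ord)", [
        ("R(Loc, Ord)", [
            ("R(Ord, Reg)", []),
        ]),
        ("R(Obj, Reg * Nom)", []),
    ]),
])


def flatten(root):
    """Give each nested node an id; return ({id: type}, {(target, source)})."""
    types, edges = {}, set()

    def visit(node):
        node_type, sources = node
        node_id = len(types)  # ids are handed out in pre-order
        types[node_id] = node_type
        for source in sources:
            edges.add((node_id, visit(source)))
        return node_id

    visit(root)
    return types, edges


types, edges = flatten(nested)
print(types)
print(sorted(edges))
```

With ids 0 to 4 standing in for `<A>`, `<B>`, `<C>`, `<D>` and `<E>`, the edges are the same `from:` links as in the first example.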
Developments: ec5eae2df1c20b9bc78894adb28fa2c822c700fb reintroduces querying for different aspects.
Most of this has been implemented and is now used in the cct repository. See https://github.com/quangis/cct/blob/140e30b00461b71623cee06fa441516433b67773/tasks/01a-NoisePortionAmsterdam.ttl for an example of the expected query format as it is now. I will close this issue once choice has been reimplemented.
A limited form of choice is now implemented, which was the last piece missing from this issue.
I say limited: when a step node in a transformation graph is the subject of multiple `ta:type` or `ta:via` predicates, then the associated query accepts their `UNION`. This accommodates choice: if you want to query for a step to go via either operation `f` or operation `g`, you can simply write `[] ta:via <f>, <g>.`
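For illustration, this is roughly what such a union could look like if the step were translated to a SPARQL group pattern. This is only a sketch: the helper `union_pattern`, the variable name `?step`, and the use of `ta:via` on the workflow side are assumptions, not the actual query generator.

```python
def union_pattern(var: str, predicate: str, alternatives: list[str]) -> str:
    """Build a SPARQL group pattern that accepts any one of the alternatives."""
    branches = [f"{{ {var} {predicate} <{alt}> }}" for alt in alternatives]
    return " UNION ".join(branches)


# Hypothetical translation of `[] ta:via <f>, <g>.`
print(union_pattern("?step", "ta:via", ["f", "g"]))
```

With a single alternative the pattern is just one group, so the non-choice case falls out of the same code path.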
However, it remains impossible to choose not just operators/types but entire branches, where there is an alternative also between subsequent steps. To accommodate this, I think using `rdf:List`s makes most sense:
<step1> :from (<step2a> <step2b>).
However, I don't think we will need it. If need arises, it will be the subject of a new issue.
Note that this syntax is somewhat inconsistent: `<step0> ta:from <step1>, <step2>` is interpreted as needing a connection from both `<step1>` and `<step2>`, whereas `<step0> ta:type <A>, <B>` is interpreted as needing to have either `<A>` or `<B>`. For consistency, perhaps the syntax should be changed, but there are other things to attend to first.
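The difference in interpretation can be made concrete with a toy matcher. The dictionary layout and the function `step_matches` are invented for this sketch and do not reflect the real data model; the point is only that multiple `ta:type` objects are read as "any", while multiple `ta:from` objects are read as "all".

```python
def step_matches(query_step: dict, candidate: dict) -> bool:
    """Toy check of one workflow step against one query step."""
    # ta:type with several objects: any one alternative suffices (OR).
    type_ok = not query_step["type"] or candidate["type"] in query_step["type"]
    # ta:from with several objects: every source must be connected (AND).
    from_ok = all(src in candidate["from"] for src in query_step["from"])
    return type_ok and from_ok


# Hypothetical query step: <step0> ta:type <A>, <B>; ta:from <step1>, <step2>.
query_step = {"type": {"A", "B"}, "from": {"step1", "step2"}}

print(step_matches(query_step, {"type": "B", "from": {"step1", "step2"}}))
print(step_matches(query_step, {"type": "B", "from": {"step1"}}))
```

The first candidate is accepted because one type alternative is enough. The second is rejected because one of the two required sources is missing.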
Queries are now represented in Python, which was convenient but isn't good for interoperability with the web module and for writing scripts. A JSON representation makes a lot more sense, like `[{"or": ["max","min"]}]`.
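A minimal sketch of how such a JSON query might be read, under assumptions that the issue does not spell out: the outer list is taken to be a sequence of steps, a bare string names one operation, and `{"or": [...]}` accepts any of its members. The function names are hypothetical.

```python
import json


def element_matches(pattern, operation: str) -> bool:
    """Match one operation name against one pattern element."""
    if isinstance(pattern, str):
        return pattern == operation
    if isinstance(pattern, dict) and set(pattern) == {"or"}:
        return any(element_matches(p, operation) for p in pattern["or"])
    raise ValueError(f"unsupported pattern: {pattern!r}")


def query_matches(query: list, operations: list) -> bool:
    """Assumed semantics: the query lists one pattern per step, in order."""
    return len(query) == len(operations) and all(
        element_matches(p, op) for p, op in zip(query, operations)
    )


query = json.loads('[{"or": ["max", "min"]}]')
print(query_matches(query, ["min"]))
print(query_matches(query, ["sum"]))
```

Because the query is plain JSON, the same text could be produced by a web front end or a script without importing any Python classes.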