quangis / quangis-workflow

Tools to describe GIS workflows semantically, and to generate them. Includes the core concept transformation algebra (CCT).
GNU General Public License v3.0

Recognize permutations of inputs/outputs #8

Open nsbgn opened 1 year ago

nsbgn commented 1 year ago

It happens sometimes that multiple signatures are created that are the same, just with the order of the inputs shuffled. The same happens for supertools, and even inside the actions of supertools. Consider:

supertool:SelectLayerByLocationDistTessObject a :Supertool ;
    :action
        [ :apply arc:export-features.htm ;
            :inputs ( _:d3 ) ;
            :outputs ( _:d2 ) ],
        [ :apply arc:select-layer-by-location.htm ;
            :inputs ( _:d1 _:d0 ) ;
            :outputs ( _:d3 ) ] ;
    :inputs ( _:d0 _:d1 ) ;
    :outputs ( _:d2 ) .

supertool:SelectLayerByLocationTessObject a :Supertool ;
    :action
        [ :apply arc:select-layer-by-location.htm ;
            :inputs ( _:d0 _:d1 ) ;
            :outputs ( _:d3 ) ],
        [ :apply arc:export-features.htm ;
            :inputs ( _:d3 ) ;
            :outputs ( _:d2 ) ] ;
    :inputs ( _:d0 _:d1 ) ;
    :outputs ( _:d2 ) .

While the annotator could be made responsible for keeping the order consistent, the point of drawing from manual annotations was to avoid such mistakes, so permutations should really be recognized automatically.

Should we take into account input order for concrete tools? We could drop the list structure there. That would instantly remove extraneous supertools, but not extraneous signatures.
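To make the idea concrete, here is a minimal sketch of what dropping the list structure on actions would buy us. It is not the repository's code: the `Action` and `Supertool` tuples and the `canonical` helper are made up for illustration, and it assumes that both descriptions already use the same blank node names (in general a proper graph isomorphism check would be needed).

```python
from typing import NamedTuple


class Action(NamedTuple):
    tool: str
    inputs: tuple
    outputs: tuple


class Supertool(NamedTuple):
    inputs: tuple
    outputs: tuple
    actions: tuple


def canonical(st: Supertool):
    """Comparison key that ignores the order of actions and of each
    action's inputs/outputs, but keeps the supertool's own input order."""
    actions = frozenset(
        (a.tool, tuple(sorted(a.inputs)), tuple(sorted(a.outputs)))
        for a in st.actions
    )
    return (st.inputs, st.outputs, actions)


# The two supertools from the example above, with blank nodes as strings.
dist = Supertool(
    inputs=("d0", "d1"),
    outputs=("d2",),
    actions=(
        Action("arc:export-features.htm", ("d3",), ("d2",)),
        Action("arc:select-layer-by-location.htm", ("d1", "d0"), ("d3",)),
    ),
)
plain = Supertool(
    inputs=("d0", "d1"),
    outputs=("d2",),
    actions=(
        Action("arc:select-layer-by-location.htm", ("d0", "d1"), ("d3",)),
        Action("arc:export-features.htm", ("d3",), ("d2",)),
    ),
)

same = canonical(dist) == canonical(plain)
```

With the ordered representation the two tuples differ, while their canonical keys coincide, so the second supertool would be recognized as a duplicate of the first.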

nsbgn commented 1 year ago

The ordering on action inputs/outputs can probably be dropped. However, be careful: once it is dropped, a supertool with :inputs ( _:d1 _:d2 ) becomes isomorphic to a supertool with :inputs ( _:d2 _:d1 ), so this might still lead to mixups.
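The mixup can be reproduced with a brute-force check over blank node renamings. This is only a sketch under assumptions: the tuple representation, the helper names and the tool name `tool:erase` are hypothetical, and trying every renaming is only feasible for tiny graphs.

```python
from itertools import permutations


def rename(st, mapping):
    """Apply a blank node renaming to a (inputs, actions) pair."""
    inputs, actions = st
    return (
        tuple(mapping[n] for n in inputs),
        tuple(
            (tool, tuple(mapping[n] for n in ins), tuple(mapping[n] for n in outs))
            for tool, ins, outs in actions
        ),
    )


def normalise(st, keep_action_order):
    """Keep the supertool's own input order; optionally drop action order."""
    inputs, actions = st
    wrap = tuple if keep_action_order else frozenset
    return (
        inputs,
        frozenset((tool, wrap(ins), wrap(outs)) for tool, ins, outs in actions),
    )


def nodes_of(st):
    inputs, actions = st
    return sorted(set(inputs) | {n for _, ins, outs in actions for n in ins + outs})


def isomorphic(a, b, keep_action_order):
    """True if some renaming of a's blank nodes turns a into b."""
    names_a, names_b = nodes_of(a), nodes_of(b)
    if len(names_a) != len(names_b):
        return False
    target = normalise(b, keep_action_order)
    return any(
        normalise(rename(a, dict(zip(names_a, perm))), keep_action_order) == target
        for perm in permutations(names_b)
    )


# Same non-commutative (hypothetical) tool, supertool inputs swapped.
a = (("d1", "d2"), (("tool:erase", ("d1", "d2"), ("d3",)),))
b = (("d2", "d1"), (("tool:erase", ("d1", "d2"), ("d3",)),))

distinct_when_ordered = not isomorphic(a, b, keep_action_order=True)
merged_when_unordered = isomorphic(a, b, keep_action_order=False)
```

With ordered action inputs the two supertools stay distinct. Once the action inputs become a set, swapping `d1` and `d2` maps one onto the other, even though they feed the tool's arguments differently.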

nsbgn commented 1 year ago

We need to think harder about where order is relevant and where it is not. The situation right now is as follows:

The ordering of tools is used in these ways:

Therefore, the only place where the numbering is really essential is in workflow actions (where the action can be associated with a signature). Using labels instead of lists or numbered predicates would afford us some flexibility:

The downside is that it's a little more verbose. For illustration, consider:

Was:

wf:_1 a wf:Workflow;
    wf:source _:d1, _:d2;
    wf:target _:d3;
    wf:edge [
        wf:applicationOf signature:_1;
        wf:input1 _:d1;
        wf:input2 _:d2;
        wf:output _:d3
    ].

supertool:_1 a :Supertool;
    :inputs ( _:d1 _:d2 );
    :outputs ( _:d3 );
    :action [
        :apply supertool:_1;
        :inputs ( _:d1 _:d2 );
        :outputs ( _:d3 )
    ].

signature:_1 a :Signature;
    :inputs ( [ a ccd:Type ] [ a ccd:Type ] );
    :outputs ( [ a ccd:Type ] );
    cct:expression "f 1 2";
    :implementation supertool:_1.

Becomes:

wf:_1 a :Workflow;
    :input _:d1, _:d2;
    :output _:d3;
    :action [
        :apply signature:_1;
        :input [ :id "1"; :as _:d1 ];
        :input [ :id "2"; :as _:d2 ];
        :output _:d3
    ].

supertool:_1 a :Supertool;
    :input [ :id "1"; :as _:d1 ];
    :input [ :id "2"; :as _:d2 ];
    :output _:d3;
    :action [
        :apply supertool:_1;
        :input _:d1, _:d2;
        :output _:d3
    ].

signature:_1 a :Signature;
    :input [ :id "1"; a ccd:Type ];
    :input [ :id "2"; a ccd:Type ];
    :output [ a ccd:Type ];
    cct:expression "f 1 2";
    :implementation supertool:_1.

This would partially reinstate the changes reverted in https://github.com/quangis/quangis-workflow/commit/96bb7fe730d8a4a281bb9184057c910dd6db73a8.

An alternative would be to use :inputs when order is relevant and :input when it's not. :inputs ( x ... ) would of course imply :input x. This has some of the benefits of using :ids, but not all.

wf:_1 a :Workflow;
    :input _:d1, _:d2;
    :output _:d3;
    :action [
        :apply signature:_1;
        :inputs ( _:d1 _:d2 );
        :output _:d3
    ].

supertool:_1 a :Supertool;
    :inputs ( _:d1 _:d2 );
    :output _:d3;
    :action [
        :apply supertool:_1;
        :input _:d1, _:d2;
        :output _:d3
    ].

signature:_1 a :Signature;
    :inputs ( [ a ccd:Type ] [ a ccd:Type ] );
    :output [ a ccd:Type ];
    cct:expression "f 1 2";
    :implementation supertool:_1.
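
The implication from :inputs to :input could be materialized with a small expansion step. The sketch below works on plain Python tuples rather than on a real RDF graph, and `expand_inputs` is a hypothetical helper, not something that exists in the repository.

```python
def expand_inputs(triples):
    """Add (s, ':input', x) for every x in an (s, ':inputs', (x, ...)) statement."""
    implied = {
        (s, ":input", x)
        for s, p, o in triples
        if p == ":inputs"
        for x in o
    }
    return set(triples) | implied


triples = {
    ("supertool:_1", ":inputs", ("_:d1", "_:d2")),
    ("supertool:_1", ":output", "_:d3"),
}
expanded = expand_inputs(triples)
```

After expansion, order-insensitive queries only need to look at :input, while order-sensitive ones can still consult the :inputs list.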

nsbgn commented 1 year ago

A distinction between ConcreteArtefacts (for Workflows) and SchematicArtefacts (for Supertools and Signatures) makes this easier to work with. Combine this with an explicit Label object for ConcreteActions.

wf:_1 a :Workflow;
    :source _:d0, _:d2;
    :action [
        :apply signature:_1;
        :input _:d0;
        :output _:d1;
        :label [ :id "1"; :for _:d0 ]
    ];
    :action [
        :apply signature:_2;
        :input _:d1, _:d2;
        :output _:d3;
        :label [ :id "1"; :for _:d1 ],
            [ :id "2"; :for _:d2 ]
    ].

supertool:_1 a :Supertool;
    :action [
        :apply tool:_1;
        :input [ :id "1" ];
        :output _:d3
    ], [
        :apply tool:_2;
        :input _:d3, [ :id "2" ];
        :output [ rdfs:label "final output" ]
    ].

signature:_1 a :Signature;
    :input [ :id "1"; a ccd:Type ];
    :input [ :id "2"; a ccd:Type ];
    :output [ a ccd:Type ];
    cct:expression "f 1 2";
    :implementation supertool:_1.
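
As a rough illustration of how such labels might be consumed, the argument order of a concrete action can be recovered from its (id, node) pairs. The pair representation and the `ordered_arguments` helper are assumptions made for this sketch, and it presumes that ids are numeric strings.

```python
def ordered_arguments(labels):
    """Return the data nodes of an action, sorted by the numeric value of their label id."""
    return [node for _, node in sorted(labels, key=lambda pair: int(pair[0]))]


# Labels of the second action in the workflow above, deliberately out of order.
labels = [("2", "_:d2"), ("1", "_:d1")]
args = ordered_arguments(labels)  # ['_:d1', '_:d2']
```

The unordered :input statements then only say which data the action touches, and the labels alone decide which signature input each one fills.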

nsbgn commented 1 year ago

Note also that the original supertools labelled their action inputs (which is unnecessary) but did not label their own inputs (which is necessary). This may lead to wrong identification of supertools later on.

nsbgn commented 1 year ago

As of #11, permuted subsuming CCD signatures are detected after the fact, but not yet while adding new tools, which is what this issue is about.

nsbgn commented 1 year ago

We don't just need to know whether there is a permutation that matches, but also which one matches.
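
A minimal sketch of returning the permutation itself rather than a yes/no answer. The type names and the `PARENT` hierarchy are made up and only stand in for real CCD types, `subsumes` is a toy replacement for the actual subsumption check, and the brute-force search is factorial in the number of inputs.

```python
from itertools import permutations

# Hypothetical toy hierarchy standing in for CCD types.
PARENT = {"Raster": "Layer", "Vector": "Layer"}


def subsumes(general, specific):
    """True if `specific` equals `general` or is one of its descendants."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT.get(specific)
    return False


def matching_permutations(new_inputs, known_inputs):
    """Yield index tuples p such that known_inputs[i] subsumes
    new_inputs[p[i]] for every position i."""
    if len(new_inputs) != len(known_inputs):
        return
    for perm in permutations(range(len(new_inputs))):
        if all(subsumes(known_inputs[i], new_inputs[j]) for i, j in enumerate(perm)):
            yield perm


unique = list(matching_permutations(("Raster", "Vector"), ("Vector", "Raster")))
ambiguous = list(matching_permutations(("Raster", "Vector"), ("Layer", "Layer")))
```

In the first case exactly one permutation, `(1, 0)`, matches, which tells us how to reorder the new tool's inputs. In the second case both permutations match, so type information alone cannot decide which input goes where.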