Closed by nsbgn 2 years ago
Type fixing gets very precarious when the output types of tool applications are derived from the input types.
Some form of that problem will always remain, because we're mixing two things: unifying types (i.e. saying that x and y are the same when they are both variables) and setting subtype bounds on them (i.e. saying that x is a subtype of A if the latter is a concrete type). That's fairly fundamental. While it's okay to deduce from an application of f: x ** x ** x to A that the output type of f must be x >= A, we are essentially doing the reverse for sources: when we have f (1: A), we try to deduce the input type, which we are unifying with x. We know from 1: A that it must be <= A and from applying f that it must be >= A. But that's reasoning about x as it appears in the type signature for the particular application of f; the type of the source input itself can always be more specific when used in another application.
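To make the distinction concrete, here is a minimal sketch in plain Python. It does not use transforge: the Var class, its method names and the four-element chain of types are assumptions made up for illustration only. It shows how the annotation and the application together pin down the variable for one particular application, while a second use of the same source only inherits the bound from the annotation.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed toy hierarchy, most specific first; only the order matters here.
ORDER = ["Ratio", "Itv", "Ord", "Nom"]


def is_subtype(a: str, b: str) -> bool:
    """True if a <= b in the toy chain above."""
    return ORDER.index(a) <= ORDER.index(b)


@dataclass
class Var:
    """Hypothetical type variable with an optional lower and upper bound."""
    lower: Optional[str] = None  # var >= lower
    upper: Optional[str] = None  # var <= upper

    def at_least(self, t: str) -> None:
        # Keep the most general lower bound seen so far.
        if self.lower is None or is_subtype(self.lower, t):
            self.lower = t

    def at_most(self, t: str) -> None:
        # Keep the most specific upper bound seen so far.
        if self.upper is None or is_subtype(t, self.upper):
            self.upper = t

    def pinned(self) -> Optional[str]:
        # The variable is determined only when both bounds coincide.
        if self.lower is not None and self.lower == self.upper:
            return self.lower
        return None


# x as it appears in one particular application f (1: Itv):
per_application = Var()
per_application.at_most("Itv")   # from the annotation 1: Itv
per_application.at_least("Itv")  # from applying f to it

# The same source used in a different application: only the annotation
# bound carries over, so a more specific type such as Ratio is still allowed.
other_use = Var()
other_use.at_most("Itv")

print(per_application.pinned())  # Itv
print(other_use.pinned())        # None
```

The point of the sketch is that the lower bound belongs to the application, not to the source, so carrying it over to the source is what makes later uses fail.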
Hope this is somewhat clear for posterity, because it'll definitely be a good candidate for revisiting.
From the previous commit:
It does reduce the errors in generated workflows from github.com/quangis/quangis-workflow-generator; they are down from 44 to 27:
$ transforge graph cct -T build/tools.ttl solutions/s*.ttl --skip-error 2>&1 | grep Skipping | wc -l
27
However, it is not enough to eliminate all errors. I think some errors are due to sources actually being different, and APE just treating all sources with matching CCD type as being the same?
Now reproduce with:
from cct.language import cct
from transforge import TransformationGraph
from transforge.workflow import WorkflowDict
from transforge.namespace import EX

# Each tool application maps to a pair: (CCT expression, list of inputs).
wf = WorkflowDict(EX.root, {
    EX.IDWInterval: (
        "interpol (1: PointMeasures) (deify (-: Reg))",
        [EX.aed1]
    ),
    EX.ZonalStatisticsMeanInterval: (
        """
        1: Field(Itv);
        2: ObjectInfo(Nom);
        join_attr
            (get_attrL 2)
            (apply1 (fcont avg 1) (get_attrL 2))
        """,
        [EX.IDWInterval, EX.ca72]),
    EX.SpatialJoinSumTessRatio: (
        """
        1: ObjectInfo(Ratio);
        2: ObjectInfo(Nom);
        join_attr
            (get_attrL 2)
            (join (get_attrL 2) (groupbyR sum (join_key
                (select eq (rTopo
                    (pi2 (get_attrL 1))
                    (pi2 (get_attrL 2))
                ) in)
                (getamounts 1)
            )))
        """,
        [EX.ca72, EX.ZonalStatisticsMeanInterval])
}, {EX.ca72, EX.aed1})  # the sources; EX.ca72 is used by two tools

# Building the graph triggers type inference on the workflow.
g = TransformationGraph(cct)
g.add_workflow(wf)
The problem is that, when types are fixed as the final step of the process rather than immediately after applying, we often end up with types like x [x <= Itv, x >= Itv] rather than Itv. When we try to make that a subtype of Nom, it fails, because Nom <= Itv fails; whereas if we just had Itv, it would have succeeded.
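The following toy model illustrates one way such a failure can arise. It is a sketch under assumptions, not transforge's implementation: BoundedVar, make_subtype_of and fix are hypothetical names, the chain of types is assumed, and the rule that a new upper bound must tighten the existing one is a guess at the kind of check that produces "Nom <= Itv fails".

```python
# Assumed toy hierarchy, most specific first.
ORDER = ["Ratio", "Itv", "Ord", "Nom"]


def is_subtype(a, b):
    """True if a <= b in the toy chain above."""
    return ORDER.index(a) <= ORDER.index(b)


class BoundedVar:
    """Hypothetical variable x [x <= upper, x >= lower]."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper

    def make_subtype_of(self, t):
        # Assumed rule: a new upper bound may only tighten the old one,
        # so t must itself be a subtype of the current upper bound.
        if not is_subtype(t, self.upper):
            raise TypeError(f"{t} <= {self.upper} fails")
        self.upper = t

    def fix(self):
        # Collapse to a concrete type once both bounds coincide.
        return self.lower if self.lower == self.upper else self


# Unfixed: x [x <= Itv, x >= Itv] constrained to be a subtype of Nom.
x = BoundedVar("Itv", "Itv")
try:
    x.make_subtype_of("Nom")
    unfixed_ok = True
except TypeError:
    unfixed_ok = False

# Fixed first: the variable becomes plain Itv, and Itv <= Nom holds.
fixed = BoundedVar("Itv", "Itv").fix()
fixed_ok = is_subtype(fixed, "Nom")

print(unfixed_ok, fixed_ok)  # False True
```

Under this assumed rule, the two representations of the same type behave differently, which is why the moment at which fixing happens matters.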
We can change type fixing to happen immediately again, and this does fix our issue. The errors are down to 8, and they look like annotation errors rather than inference errors.
However, this would not be the 'final', proper way to solve it. I think the types of values should not be as intertwined with the types of function applications; this will be another issue.
Reusing sources may cause overly general types to be inferred the first time a source is encountered, causing type mismatches the second time around.
The first time we encounter this is in solution113.ttl as generated by https://github.com/quangis/quangis-workflow-generator/tree/58735e1372b5a8d13463ece60ac8617d160560cb. To reproduce, do the following:

Swapping expr1 and expr2 solves the issue.

This is because types are fixed as soon as we encounter a source, which is fine if a source is only used once but might be incorrect if it is encountered more often. Passing through the structure twice would solve this.
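The two-pass idea can be sketched as follows. This is an illustration in plain Python, not the transforge code: the function names and the chain of types are assumptions, and each use of a source is reduced to the single type that the use requires the source to be a subtype of. In a chain the most specific requirement is simply the minimum; a real subtype lattice would need a proper meet.

```python
# Assumed toy hierarchy, most specific first.
ORDER = ["Ratio", "Itv", "Ord", "Nom"]


def is_subtype(a, b):
    """True if a <= b in the toy chain above."""
    return ORDER.index(a) <= ORDER.index(b)


def fix_eagerly(uses):
    """Fix the source type at its first use, then check every use."""
    fixed = uses[0]
    return all(is_subtype(fixed, required) for required in uses)


def fix_in_two_passes(uses):
    """First collect all requirements, then fix and check."""
    # Pass 1: the most specific type required by any use.
    fixed = min(uses, key=ORDER.index)
    # Pass 2: check every use against the fixed type.
    return all(is_subtype(fixed, required) for required in uses)


# The first use only needs Nom, the second needs Itv.
print(fix_eagerly(["Nom", "Itv"]))        # False: Nom <= Itv fails
print(fix_eagerly(["Itv", "Nom"]))        # True: swapping the uses helps
print(fix_in_two_passes(["Nom", "Itv"]))  # True regardless of order
print(fix_in_two_passes(["Itv", "Nom"]))  # True
```

In this toy model, eager fixing depends on the order in which uses are visited, which matches the observation that swapping the two expressions makes the error go away.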