The point is that, given a term, there is only a finite number of generalizations of this term. When the whistle blows, a generalization of the "upper" term is performed, and all of its child nodes are removed from the graph of configurations. And this can be repeated only a finite number of times.
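To make the counting argument concrete, here is a minimal Haskell sketch; the `Term` type and the helper names are hypothetical, not SPSC's actual code. A generalization replaces a non-variable subterm with a fresh variable, so each proper generalization strictly decreases the number of non-variable nodes, and a term has only finitely many of those to begin with.

```haskell
-- Hypothetical first-order terms: variables and f(t1, ..., tn).
data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- The measure: the number of non-variable nodes in a term.
nodes :: Term -> Int
nodes (Var _)    = 0
nodes (App _ ts) = 1 + sum (map nodes ts)

-- Generalize by replacing the subterm at a given path with a fresh
-- variable. When the path hits a non-variable subterm, the result has
-- strictly fewer nodes, so a chain of proper generalizations of a term
-- t has length at most `nodes t`.
generalizeAt :: [Int] -> String -> Term -> Term
generalizeAt []       v _          = Var v
generalizeAt (i : is) v (App f ts) =
  App f [ if j == i then generalizeAt is v t else t
        | (j, t) <- zip [0 ..] ts ]
generalizeAt _        _ t          = t
```

For example, `generalizeAt [0] "v" (App "f" [App "g" [], Var "x"])` gives `App "f" [Var "v", Var "x"]` and drops the measure from 2 to 1.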
> When the whistle blows, a generalization of the "upper" term is performed, and all of its child nodes are removed from the graph of configurations.
In one of the papers [^1] it was noted that building a graph without destroying nodes is also a legit strategy:
> Fourth, when we perform a GENERALIZE(T, M, N) step we replace node M. Instead one could replace N, since this avoids destroying the whole subtree with root M; other branches from M can be retained with no loss of information.
So maybe there is something else that guarantees termination? Also, for example, we have folding and splitting (in SPSC as well), which don't destroy upper nodes.
[^1]: Robert Glück and Morten Heine Sørensen. 1996. A Roadmap to Metacomputation by Supercompilation. In Selected Papers from the International Seminar on Partial Evaluation. Springer-Verlag, Berlin, Heidelberg, 137–160.
That's right. Actually, rebuilding nodes is an optimization that reduces the size of residual programs.
Instead of rebuilding the upper configuration, we may just generalize the lower configuration. The process tree is finite if all branches in the tree are finite. And this is due to the fact that the number of possible generalizations of a term is finite.
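As a rough illustration of generalizing the lower configuration (my own simplification over the hypothetical `Term` type from the sketch above, not SPSC's actual `msg`): compute a common generalization of the upper and lower terms and let-bind the mismatched subterms at the lower node, leaving the upper node and its other children intact.

```haskell
-- The same hypothetical Term type as above.
data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- Compute a common generalization of two terms, threading a supply of
-- fresh variable names. Returns the remaining supply, the generalized
-- term, and the bindings that recover the lower term from it.
common :: [String] -> Term -> Term -> ([String], Term, [(String, Term)])
common supply t lower
  | t == lower = (supply, t, [])
common supply (App f as) (App g bs)
  | f == g && length as == length bs =
      let step (ns, args, sub) (a, b) =
            let (ns', t, s) = common ns a b
            in  (ns', args ++ [t], sub ++ s)
          (rest, args', sub') = foldl step (supply, [], []) (zip as bs)
      in  (rest, App f args', sub')
common (n : rest) _ lower = (rest, Var n, [(n, lower)])
common []         _ lower = ([], lower, [])  -- unreachable with an infinite supply
```

With an infinite supply such as `["v" ++ show i | i <- [1 ..]]`, the lower node can be rebuilt as a let-expression over the returned bindings, and the finite number of possible generalizations of the lower term bounds how often this can happen on one branch.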
It's interesting that the whistle and the generalization can be decoupled! This is done in the case of "multi-result" supercompilation.
I. G. Klyuchnikov, S. A. Romanenko, “MRSC: a toolkit for building multi-result supercompilers”, Keldysh Institute preprints, 2011, 077, 30 pp. https://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=ipmp&paperid=183&option_lang=eng
> And this is due to the fact that the number of possible generalizations of a term is finite.
Yes, but how does this ensure that the process tree is finite? For example, why can't we generalize/fold/split two nodes M and N, then continue supercompiling the extracted subterms, then generalize/fold/split at some point again (possibly at different nodes!), and so on without end?
I suspect that this is because the extracted subterms will be "smaller" in some sense. Of course, they are syntactically smaller initially, but they can grow in size as a result of further unfolding, in which case the whistle blows and folding/generalization is performed again, possibly against a different parent node.
Splitting is not done in an arbitrary way. Its purpose is to create a loop in the process tree.
During the construction of a process tree, we ensure that homeomorphic embedding never holds between an ancestor term and a descendant term in any descending sequence of terms. Homeomorphic embedding is a well-quasi-order (by Kruskal's tree theorem), so every infinite sequence of terms would have to contain such an embedded pair; therefore, every descending sequence of terms in the process tree is finite, the whole process tree is finite, and the process of supercompilation terminates. At the same time, we don't check generalization nodes, because if we exclude them from an infinite sequence, the sequence will still be infinite, so homeomorphic embedding will apply to it as well.
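For reference, here is a minimal sketch of the homeomorphic-embedding check itself, written over the same hypothetical `Term` type as in the sketches above; this is the textbook definition from the supercompilation literature, not SPSC's actual code.

```haskell
data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- t1 is homeomorphically embedded in t2 if t1 couples with t2 (same
-- head symbol, arguments embedded pairwise) or t1 dives into one of
-- t2's arguments.
embedded :: Term -> Term -> Bool
embedded t1 t2 = coupling t1 t2 || diving t1 t2
  where
    coupling (Var _)    (Var _)    = True
    coupling (App f as) (App g bs) =
      f == g && length as == length bs && and (zipWith embedded as bs)
    coupling _          _          = False

    diving t (App _ ts) = any (embedded t) ts
    diving _ _          = False
```

For instance, `embedded (App "f" [Var "x"]) (App "g" [App "f" [Var "y"]])` is `True` by diving. Since this relation is a well-quasi-order, a branch on which no term embeds an ancestor cannot be extended forever.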
Since unrestricted unfolding usually leads to infinite computation, folding or generalization is performed sooner or later to produce a finite graph of configurations. Homeomorphic embedding guarantees that folding/generalization will be performed whenever unfolding is about to construct an infinite sequence of terms, but it remains unclear to me why the whole supercompilation algorithm terminates. In particular, why can't folding/generalization be performed indefinitely?
In the linked papers (taken from the project website), I could not really find a termination proof (or at least a general intuition) for first-order supercompilation. Is there a "quick answer" to this, or are there papers that provide intuition about the termination of first-order supercompilation? Thanks.