How to handle non-strongly-connected systems

Kerl13 commented 5 years ago

The paper mentions that it is assumed that the system is strongly connected. And indeed on the following (non-strongly-connected) example, paganini fails to set the correct value to Forest although it finds the correct singularity:

from paganini import *

z, Tree, Forest = Variable(), Variable(), Variable()
spec = Specification()

spec.add(Forest, Seq(Tree))
spec.add(Tree, z + Tree * Tree)

spec.run_singular_tuner(z)

print(f"z: {z.value}")
print(f"Tree: {Tree.value}")
print(f"Forest: {Forest.value}")

Output:

z: 0.250000000000019
Tree: 0.499999999999922
Forest: 65541.1663326292

So I have some questions:

Should paganini check that the system is strongly connected (except in the rational case) or is it the user's job?
What should the user do for such systems? Here it's easy to see that the correct value for Forest can be obtained by 1 / (1 - Tree.value), but what about more complex examples?
In case there is a general workaround to the issue, how much work would it require to implement it?

maciej-bendkowski commented 5 years ago

Ad. 1. Currently, paganini does not check if the system is either well-founded or strongly-connected. That could, of course, be implemented in paganini (with an optional --force flag to disable the check) following the work of Pivoteau et. al.

Ad. 2 and 3. I'm not sure what is supposed to happen in this particular case - it's not yet clear if it's a bug or not (probably it is). I have to investigate the system specification.

maciej-bendkowski commented 5 years ago

I verified that the construction of the corresponding optimisation problem is correct. In fact, so are the numerical values obtained from the solver. So, it's not a "bug" per se. Given that the system is not strongly connected, it falls outside of the scope of provably tunable systems. However, it's worth keeping in mind, that some not strongly connected systems can be effectively handled (consider the case of lambda terms in De Bruijn notation). It's just that we do not have (hopefully yet) a formal argument which not strongly connected systems are tunable, merely some intuitions.

I had a discussion with @Electric-tric about your case. As a general workaround for not strongly connected, we would suggest to fix a large, finite size parameter N, and use it instead of the singular tuner. So, for instance, if you are interested in sampling forests, you can set z = Variable(N) and then invoke sp.run_tuner(Forest, params). That should help in some situations, like yours.

Kerl13 commented 5 years ago

Thanks for all these clarifications.

About the checks, I think I would be a valuable addition to paganini since not any user may be aware of its exact scope. I'd be happy to help on this if you think it's worth the effort.

For the purpose of sampling objects, choosing a large value for N seems indeed to be a good idea. But in my use-case I'd like to evaluate some generating functions exactly at the singularity, which, I think, does require to run the singular tuner since a small error on the singularity can incur a large error on the value of the generating function. Am I right on this?

Also, consider the following other example of non-strongly-connected grammar:

A = Apar + Aplus
Apar = Z * Seq(A)
Aplus = Seq(Apar, ≥ 2)

C = Cplus + Cpar
Cpar = Z * Seq(C)
Cplus = Cpar * Seq(Apar, ≥ 1) + Seq(Apar, ≥ 1) * Cpar * Seq(Apar)

This is almost (I removed the Y) the A and \bar{A} grammars from https://www-apr.lip6.fr/~genitrini/publi/fsttcs13_genitrini.pdf (see sections 2.1 and 4.2). This grammar has two strongly connected components:

A, Apar, Aplus with one singularity
C, Cpar, Cplus with a smaller singularity

Is paganini gonna find that the singularity of the whole system should be the one of C? Except for the values of A and C that can be easily fixed, are the other values gonna be correct?

Sorry for the long conversation

maciej-bendkowski commented 5 years ago

Let me answer in order:

I think that some optional sanity checks might be a good feature for paganini.
You are right, a tiny difference in the singularity can induce a not insignificant change in the values of the corresponding generating functions. In theory, if you have a singularity rho, you can still use a numerical oracle to get the exact values of your generating functions (consult the works of Pivoteau et. al.) should you want to be sure.
It's difficult for me to provide meaningful, and general answers once we fall outside of the assumed framework. Already for rational languages, you can obtain almost arbitrary pattern distributions (see Banderier et al., https://hal.archives-ouvertes.fr/hal-00643598/document). Having that in mind, we had quite good reason to start with strongly connected systems. Regarding the specific system in question, yes, paganini finds that the singularity for the whole system is smaller than just for the component associated with A (check it for yourself!). I haven't verified the other values, though.

Sergey-A-Dovgal commented 4 years ago

If we focus only on finite-size tuning (and not the singular one), then the assumption of strong connectivity can be weakened towards another assumption that all the states are reachable from the initial state. We are going to put a small theorem in a new version about this.

As for singular tuning, I am not so optimistic that we still get meaningful results. This is because the singular tuning does not see from which class you are going to sample. In your first example, from the state "Forest" you can reach the "Tree", so if you launch the finite size tuner on Forest, you are ok, but if you do singular sampling on "z", then only "Tree" is visible from "z", and not the "Forest"

Sergey-A-Dovgal commented 4 years ago

@maciej-bendkowski Maybe we can make a small preprocessing procedure on the dependency graph to get rid of non-reachable states when we construct the tuning problem. This is of course for some future enhancement.

maciej-bendkowski commented 4 years ago

@Electric-tric Sure, I agree - we might provide this as a library function.

maciej-bendkowski commented 4 years ago

@Kerl13 I've pushed a new version 1.3.0 (see 519a6429cc05c17a32e2f8e482c82e337dae4f4e). It solves your problem.

With the new version, singular tuning can be passed an additional method parameter controlling the behavior of the tuning procedure. Method.STRICT checks if the given specification matches theoretical requirements (strong-connectivity) and, if so, the singular tuner is used.

Otherwise, if you pass Method.HEURISTIC (enabled by default), some heuristic methods are tried. In some cases, the singular tuner might still be executed, consider for instance the (not strongly connected) lambda-terms specification. In others, as in your specific case, a regular tuner is executed with some arbitrary large size parameter value. Still, there are some cases when even heuristics are not applicable. In such cases, users might resort to Method.JEDI which disables theoretical checks and directly runs the singular sampler -- use it only if the "force is strong with you"!

The new version fixes your particular problem, @Kerl13. Nonetheless, the current fix is not stable enough for me to publish. I am not sure if heuristic methods should be enabled by default. Perhaps using strict methods would provide a better user experience. @Electric-tric?

maciej-bendkowski commented 4 years ago

After some thought, I decided not to provide heuristic methods. I think that they may seem counter-intuitive to inexperienced users. Instead I expanded on the error messages. That should make handling non-strongly connected systems a bit easier.

maciej-bendkowski / paganini

How to handle non-strongly-connected systems #2