vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
1.16k stars 138 forks source link

Pattern ?a,p?,?b not matching in nested functions #98

Open benruijl opened 8 years ago

benruijl commented 8 years ago

The following program does not match on the first pattern, but does on the second:

V p,p1,p2;
CF f,f1;

L F = f1(p1)*f(f(p1,p2));

id f(f(?a,p?,?b))*f1(p?) = 1; * does not match!
id f(f(p?,?b))*f1(p?) = 2;

Print +s;
.end

If I remove p2 or rename f1 to f, it does match on the first id.

tueda commented 8 years ago

This is a variation of the well-known issue, right? https://github.com/vermaseren/form/issues/12#issuecomment-77137755

benruijl commented 8 years ago

Indeed, the origin of the bug is likely the same. This example does not require the symmetric property of the subfunction, however.

Op vr 27 mei 2016 18:37 schreef Takahiro Ueda notifications@github.com:

This is a variation of the well-known issue, right? #12 (comment) https://github.com/vermaseren/form/issues/12#issuecomment-77137755

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-222194073, or mute the thread https://github.com/notifications/unsubscribe/AARGGdsM6x9TCwrn4EsqesgO3NFxKV4Tks5qFx3CgaJpZM4IoiMA .

benruijl commented 7 years ago

Jos, could you have a look at this problem? Below is another example.

CF f2,f1; * bug
*CF f1,f2; * works
S n,n1,n2,n3;

L F = f2(f2(n1,n2,n3))*f1(n2);

id f2(f2(?a,n?,?b))*f1(n?) = 1;

Print +s;
.end
vermaseren commented 7 years ago

I am not sure there is an ‘easy’ fix for this. Things are rather complicated there in the backtracking. It is not impossible that the whole philosophy of the pattern matching would need changes as it was never designed for anything this complex.

Jos

On 15 jun. 2017, at 18:03, Ben Ruijl notifications@github.com wrote:

Jos, could you have a look at this problem? Below is another example.

CF f2,f1; bug CF f1,f2; * works S n,n1,n2,n3;

L F = f2(f2(n1,n2,n3))*f1(n2);

id f2(f2(?a,n?,?b))*f1(n?) = 1;

Print +s; .end — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308785595, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEl_MC0brXsvxeN4qKIx-yDN1j-o4ks5sEVXngaJpZM4IoiMA.

benruijl commented 7 years ago

Perhaps we should print a warning for such a pattern then? Sometimes it works, other times it does not.

Op do 15 jun. 2017 22:20 schreef Jos Vermaseren notifications@github.com:

I am not sure there is an ‘easy’ fix for this. Things are rather complicated there in the backtracking. It is not impossible that the whole philosophy of the pattern matching would need changes as it was never designed for anything this complex.

Jos

On 15 jun. 2017, at 18:03, Ben Ruijl notifications@github.com wrote:

Jos, could you have a look at this problem? Below is another example.

CF f2,f1; bug CF f1,f2; * works S n,n1,n2,n3;

L F = f2(f2(n1,n2,n3))*f1(n2);

id f2(f2(?a,n?,?b))*f1(n?) = 1;

Print +s; .end — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/vermaseren/form/issues/98#issuecomment-308785595>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AFLxEl_MC0brXsvxeN4qKIx-yDN1j-o4ks5sEVXngaJpZM4IoiMA .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308845505, or mute the thread https://github.com/notifications/unsubscribe-auth/AARGGclaLAJZLUYUFxrm_BMZrYiAsdSQks5sEYiggaJpZM4IoiMA .

vermaseren commented 7 years ago

Recognizing such patterns that “may not work” is already an art by itself. In this case, if it matches f1 first things go well, but if it tries f2 first, the first attempt is with ?b maximal and ?a minimal and then n matches n1 and the f1 match fails. But because the fact that the ?a,n?,?b is one function level deeper it looks like it cannot backtrack into this. A warning in the manual seems better. It may be possible to determine an order of function matching. That would have to be in the file smart.c and then maybe a relatively minor change may work, but it would always be possible to make examples that will still not work. The generic workaround would be id f2(f2(?a)) = f2(f2,?a); and then id f2(f2,?a,n?,?b)*f1(n?) = …… and variations on this theme. That should also be faster although that might be withing the measuring error.

One should consider that originally already ?a,n?,?b inside functions was extremely advanced. Its presence and easy use makes it logical to try even more complicated things, but there comes a point that the original setup becomes too cumbersome and needs rethinking. But it would not be easy as the current pattern matcher is about 11000 lines of code (incl commentary) (288Kbytes) although not all would need replacement. (The complete sources are almost 4 Mbytes).

Jos

On 15 jun. 2017, at 22:25, Ben Ruijl notifications@github.com wrote:

Perhaps we should print a warning for such a pattern then? Sometimes it works, other times it does not.

Op do 15 jun. 2017 22:20 schreef Jos Vermaseren notifications@github.com:

I am not sure there is an ‘easy’ fix for this. Things are rather complicated there in the backtracking. It is not impossible that the whole philosophy of the pattern matching would need changes as it was never designed for anything this complex.

Jos

On 15 jun. 2017, at 18:03, Ben Ruijl notifications@github.com wrote:

Jos, could you have a look at this problem? Below is another example.

CF f2,f1; bug CF f1,f2; * works S n,n1,n2,n3;

L F = f2(f2(n1,n2,n3))*f1(n2);

id f2(f2(?a,n?,?b))*f1(n?) = 1;

Print +s; .end — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/vermaseren/form/issues/98#issuecomment-308785595>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AFLxEl_MC0brXsvxeN4qKIx-yDN1j-o4ks5sEVXngaJpZM4IoiMA .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308845505, or mute the thread https://github.com/notifications/unsubscribe-auth/AARGGclaLAJZLUYUFxrm_BMZrYiAsdSQks5sEYiggaJpZM4IoiMA .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308855992, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEiTW2mhcV6wfqxZacWwDuljbjtokks5sEZMugaJpZM4IoiMA.

tueda commented 7 years ago

I feel warnings in the manual or runtime could be weak in the sense they can be easily ignored or not noticed. Immediate aborting is better than getting wrong result after one month run.

vermaseren commented 7 years ago

But how should FORM know that it should have matched but did not?

Jos

On 15 jun. 2017, at 23:32, Takahiro Ueda notifications@github.com wrote:

I feel warnings in the manual or runtime is weak in the sense they can be easily ignored or not noticed. Immediate aborting is better than getting wrong result after one month run.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308871737, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEs_byvTSYDRaFh-9TI3v3fRI42hAks5sEaLpgaJpZM4IoiMA.

benruijl commented 7 years ago

FORM could automatically abort the program when a pattern such as f(f(?a,p?,?b))*f2(p?) occurs.

Op vr 16 jun. 2017 om 08:48 schreef Jos Vermaseren <notifications@github.com

:

But how should FORM know that it should have matched but did not?

Jos

On 15 jun. 2017, at 23:32, Takahiro Ueda notifications@github.com wrote:

I feel warnings in the manual or runtime is weak in the sense they can be easily ignored or not noticed. Immediate aborting is better than getting wrong result after one month run.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub < https://github.com/vermaseren/form/issues/98#issuecomment-308871737>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AFLxEs_byvTSYDRaFh-9TI3v3fRI42hAks5sEaLpgaJpZM4IoiMA .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308947811, or mute the thread https://github.com/notifications/unsubscribe-auth/AARGGRxU_U5Y3qbYtY5xuMgRPveY2Kefks5sEiVGgaJpZM4IoiMA .

vermaseren commented 7 years ago

Recognizing the problem is already partially solving it. Here you have one example. Of course that can be programmed. And how many more are there?

I think it is the responsibility of the person who writes the FORM code to check that the pattern does indeed work. The manual can give some examples of what works and what does not work and why.

Eventually a new pattern matcher may have to be written. One that is backward compatible for as far as documented features concerns. If it is completely redesigned one may get away with having to rewrite just a few thousand lines of code. Probably wildcard.c does not need too many modifications (if any). And maybe the file that deals with symmetries also not. And a new setup may be much compacter.

Jos

On 16 jun. 2017, at 10:40, Ben Ruijl notifications@github.com wrote:

FORM could automatically abort the program when a pattern such as f(f(?a,p?,?b))*f2(p?) occurs.

Op vr 16 jun. 2017 om 08:48 schreef Jos Vermaseren <notifications@github.com

:

But how should FORM know that it should have matched but did not?

Jos

On 15 jun. 2017, at 23:32, Takahiro Ueda notifications@github.com wrote:

I feel warnings in the manual or runtime is weak in the sense they can be easily ignored or not noticed. Immediate aborting is better than getting wrong result after one month run.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub < https://github.com/vermaseren/form/issues/98#issuecomment-308871737>, or mute the thread < https://github.com/notifications/unsubscribe-auth/AFLxEs_byvTSYDRaFh-9TI3v3fRI42hAks5sEaLpgaJpZM4IoiMA .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308947811, or mute the thread https://github.com/notifications/unsubscribe-auth/AARGGRxU_U5Y3qbYtY5xuMgRPveY2Kefks5sEiVGgaJpZM4IoiMA .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-308968895, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEmp6cs8UWdSr74Ack71y1aC_H1tcks5sEj95gaJpZM4IoiMA.

tueda commented 7 years ago

It seems difficult to tell FORM such patterns. So, as an easier way, I made a prototype for a lint to detect such cases: https://gist.github.com/tueda/489f55142c212aa80ce54f1256028d5e. For now only very simple cases are implemented. Set specification and $-variables for wildcards, preprocessor substitutions are not recognized. I have no ideas to detect bugs with symmetric functions, because it doesn't parse the whole context in FORM. But may be a starting point. Example:

id f(f(?a,p?,?b))*f1(p?) = 1;
id f2(f2(?a,n?,?b))*f1(n?) = 1;
id path(path(?a,p?,?b),?c)*f1(p1?,?d,p?,?e) = path(?c)*f1(p1,?d,p,?e);
$ formlint.py test.frm
test.frm:1: P102 may-not-work pattern: f1(f2(?a,x?,?b))*f3(x?) for p?
  >> f(f(?a,p?,?b))*f1(p?)
test.frm:2: P102 may-not-work pattern: f1(f2(?a,x?,?b))*f3(x?) for n?
  >> f2(f2(?a,n?,?b))*f1(n?)
test.frm:3: P102 may-not-work pattern: f1(f2(?a,x?,?b))*f3(x?) for p?
  >> path(path(?a,p?,?b),?c)*f1(p1?,?d,p?,?e)
vermaseren commented 7 years ago

OK, let us see how far you get.

I have been thinking as well about how a new pattern matcher should work. The current one steps through the subterms, trying to match them and if a wildcard does not work it may backtrack. The problem there is that if a function has functions with argument field wildcards in its arguments it would have to backtrack two levels and that it cannot do. Similarly, when it is finished with the functions it starts doing the symbols and dotproducts and vectors and from that moment it cannot backtrack into the functions. Hence the future pattern matcher should concentrate more on the wildcards and make a recursion wrt to the wildcard assignments rather than wrt to the subterms cq arguments. It should first do all objects without wildcards, then the regular wildcards and finally the argument field wildcards. It requires a completely different organization of what has been done and what has not yet been done. Hence it still needs quite some working out. But this system would not suffer from any restrictions for as far as I can see (at the moment). The old way will keep having problems which we may be solving one by one, but it is a job that will not easily finish. My guess is that the new method will also be more robust (because it is a new generation after 30+ years of development).

Jos

On 17 jun. 2017, at 10:40, Takahiro Ueda notifications@github.com wrote:

It seems difficult to tell FORM such patterns. So, as an easier way, I made a prototype for a lint to detect such cases: https://gist.github.com/tueda/489f55142c212aa80ce54f1256028d5e https://gist.github.com/tueda/489f55142c212aa80ce54f1256028d5e. For now only very simple cases are implemented. Set specification and $-variables for wildcards, preprocessor substitutions are not recognized. I have no ideas to detect bugs with symmetric functions, because it doesn't parse the whole context in FORM. But may be a starting point:

id f(f(?a,p?,?b))f1(p?) = 1; id f2(f2(?a,n?,?b))f1(n?) = 1; id path(path(?a,p?,?b),?c)f1(p1?,?d,p?,?e) = path(?c)f1(p1,?d,p,?e); $ formlint.py test.frm test.frm:1: P102 may-not-work pattern: f1(f2(?a,x?,?b))*f3(x?) for p?

f(f(?a,p?,?b))f1(p?) test.frm:2: P102 may-not-work pattern: f1(f2(?a,x?,?b))f3(x?) for n? f2(f2(?a,n?,?b))f1(n?) test.frm:3: P102 may-not-work pattern: f1(f2(?a,x?,?b))f3(x?) for p? path(path(?a,p?,?b),?c)*f1(p1?,?d,p?,?e) — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/98#issuecomment-309202260, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEpD7ydQDEGQyk6JYrWKsfykZVsHjks5sE5D6gaJpZM4IoiMA.