vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
982 stars 118 forks source link

Fix polyratfun bugs caused by incorrect clean flag #482

Closed jodavies closed 3 months ago

jodavies commented 4 months ago

There are some known issues regarding polyratfun which are not easy to fix.

The least we can do is to crash when we hit something that is definitely wrong, for now.

This "fixes" the example in Issue #336

This does not catch Issue #270 !

jodavies commented 4 months ago

Looking through the issues, I think that the known open polyratfun bugs are now caught and terminate form, so the user can adjust their script to avoid them, though of course they are not fixed. Without looking at the complicated interactions of polyratfun with collect, argument etc in more detail, this seems like a good intermediate change to make, for safety. What do you think?

jodavies commented 4 months ago

At least with @tueda 's benchmark ( https://gist.github.com/tueda/8cabf511573b115b9c17a7a181bf0248 ) there is no measurable performance difference due to this.

tueda commented 4 months ago

The code compares the powers of the first variable in the first and last terms. It is true that at least it catches problems in particular cases, e.g.,

CF rat,rat1,f;
S x,y;
L F = rat(1,1+x)*rat1(1,1+x^2);
id rat1(?a) = f(rat(?a));
.sort

PolyRatFun rat;
print "1: %t";
id f(x?) = x;
print "2: %t";
.end

But I'm not sure what really causes the error in the following sense: I can't make a simple case that is wrong but can't be caught by this check. For example, if I give F = rat(1+x,1+y)*rat1(1+x^2,1+y^2); then the order in f(rat(...)) is wrong but somehow the check didn't hit anything (maybe poly_ratfun_read works well for multivariate cases??) and indeed the result is correct:

1:  + rat(x + 1,y + 1)*f(rat(1 + x^2,1 + y^2))
2:  + rat(x^3 + x^2 + x + 1,y^3 + y^2 + y + 1)

If we can find more precise conditions under which errors occur, the check may become simpler and lighter.

jodavies commented 4 months ago

Interesting. The second variable doesn't even need to go in rat1. rat(y,1+x) * rat1(1,1+x^2) has no issues apparently.

This should be causing poly_num_vars to be 2, and so https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/poly.cc#L2567-L2568 always normalizes the poly regardless of whether the arg is marked clean or not in poly_ratfun_read.

So actually it seems like the issues can all be fixed properly, and the checks removed, by unconditionally calling normalize() here also in the univariate case? This will slow things down in univariate cases (by an amount to be determined in benchmarks...) but actually it will not affect multivariate polyratfun performance?

jodavies commented 4 months ago

For @tueda 's benchmark (with default N=30), on my system I get 17.90+-0.17s -> 18.03+-0.16s by unconditionally calling normalize().

tueda commented 4 months ago

How about some simple checks like the following pseudocode:

if (!sort_univar)
  if (is_univariate)
    if (number_of_terms >= 2)
      if (power_of_variable_in_1st_term < power_of_variable_in_2nd_term)
        sort_univar = true;
vermaseren commented 4 months ago

This needs some gymnastics with dollar variables, but it can be done.

On 1 Mar 2024, at 09:37, Takahiro Ueda @.***> wrote:

How about some simple check like

if (is_univariate) if (number_of_terms >= 2) if (power_of_variable_in_1st_term < power_of_variable_in_2nd_term) sort_univar = true; — Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/pull/482#issuecomment-1972748806, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJPCETTDLWLGIC73AVDGVDYWA425AVCNFSM6AAAAABD2PZUHCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZSG42DQOBQGY. You are receiving this because you are subscribed to this thread.

tueda commented 4 months ago

I meant adding a simple check that is negligible in time for Josh's unconditional call of normalize() in poly.cc.

vermaseren commented 4 months ago

Sorry about that then.

On 1 Mar 2024, at 10:14, Takahiro Ueda @.***> wrote:

I meant adding a simple check that is negligible in time for Josh's unconditional calling normalize() in poly.cc.

— Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/pull/482#issuecomment-1972806988, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJPCESOLZWIDDSCSAD4KA3YWBBI5AVCNFSM6AAAAABD2PZUHCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZSHAYDMOJYHA. You are receiving this because you commented.

jodavies commented 4 months ago

I'll look at this and check performance. It is not quite so simple, checking powers of first and second terms fails in the case of rat(1,1+x^2+x^-1) * rat1(1,1+x^2+x^-1).

Edit: I did not appreciate this before. In this case, not even on highfirst can save you. Neither of FORM's sort orderings are actually what the poly class wants.

jodavies commented 4 months ago

Here is a new try. Checking first and last monomials should work also for cases where symbols appear with negative exponent. Times: 17.80+-0.10 -> 17.87+-0.17 (20 samples).

tueda commented 3 months ago

It gives the correct answer for rat(1,1+x^2+x^-1)*rat1(1,1+x^2+x^-1) but not for rat(1,1+x)*rat1(1,1+x^2+1/x). (Is it an expected limitation of your poly.cc code?)

CF rat,rat1,rat2,f1,f2;
S x,y,n;

L F = rat(1,1+x)*rat1(1,1+x^2+1/x);

L G = F * replace_(rat1,rat2);
id rat1?{rat1,rat2}[n](?a) = {f1,f2}[n](rat(?a));
.sort

PolyRatFun rat;
P "1: %t";
argument f2;
  argument rat;
  endargument;
endargument;
P "2: %t";
id f1?{f1,f2}(x?) = x;
P "3: %t";

.sort
Skip;
L ZERO = F - G;
P;
.end
FORM 5.0.0-beta.1 (Mar  1 2024, v5.0.0-beta.1-28-ge58eac0)  Run: Wed Mar  6 14:03:01 2024
...
1:  + rat(1,x + 1)*f1(rat(1,1 + x^-1 + x^2))
2:  + rat(1,x + 1)*f1(rat(1,1 + x^-1 + x^2))
3:  + rat(1,x^3 + x^2 + x + 2 + 1)
...
1:  + rat(1,x + 1)*f2(rat(1,1 + x^-1 + x^2))
2:  + rat(1,x + 1)*f2(rat(x,x^3 + x + 1))
3:  + rat(x,x^4 + x^3 + x^2 + 2*x + 1)
...
   ZERO =
      rat( - x + 1,x^7 + 2*x^6 + 3*x^5 + 7*x^4 + 7*x^3 + 6*x^2 + 7*x + 3);
jodavies commented 3 months ago

No it's not expected, good catch. I'll take a look.

jodavies commented 3 months ago

I am not sure what to make of this one. In the rat(1,1+x+x^-1) * rat1(1,1+x+x^-1) case, the rat denominator is passed to poly::argument_to_poly as 13 4 1 1 1 4 -1 1 1 4 0 1 1 with sort_univar=true, and normalize produces 13 4 1 1 1 4 0 1 1 4 -1 1 1. The rat1 denominator is passed in as 13 4 0 1 1 4 -1 1 1 4 1 1 1 (I am not sure why this is different) with sort_univar=false but the new code sets it to true and we normalize again to 13 4 1 1 1 4 0 1 1 4 -1 1 1.

In the rat(1,1+x) * rat1(1,1+x+x^-1) case, the denominator of rat1 is passed just as before and, with the new code, normalized as before. And yet the result is not correct. The term print there looks like 2: + rat(1,x^2 + 2*x + 2 + 1).

jodavies commented 3 months ago

OK, now I think I follow it. poly_ratfun_read in polywrap.cc also needs to know whether normalization is required. This is where the "denominator of the denominator" is multiplied into the numerator and vice versa. So the check for bad sort ordering needs to move here instead, before calling argument_to_poly. If I force clean=false there, the new "bad examples" are also correct.

jodavies commented 3 months ago

This was quite easy to fix actually; just check for negative powers in the normalized num and den unconditionally, and then set the clean flag if they appear.

For the checks in poly.cc, personally I like it better as separate code since you can see the change compared to the old code better. But if you strongly prefer it all in the one if statement I can do that.

I am fairly sure we now catch the bad cases here. If you're happy I will squash all of these commits.

tueda commented 3 months ago

For the checks in poly.cc, personally I like it better as separate code since you can see the change compared to the old code better. But if you strongly prefer it all in the one if statement I can do that.

OK. That's fine.

This was quite easy to fix actually; just check for negative powers in the normalized num and den unconditionally, and then set the clean flag if they appear.

The check is now performed with or without the clean flag. No performance loss?

jodavies commented 3 months ago

Presumably there is a very small loss, I will try to measure. I think you do need to check all of the powers though, since in multivariate cases you can't just check the last term of the polynomial, since the negative power might be in a variable which is sorted earlier, and appear in the middle of the terms.

Edit: I don't see any measurable performance difference: 4.3.1: 17.9898+-0.1483, ce81b14: 17.9160+-0.1525.

jodavies commented 3 months ago

I ran a multivariate benchmark and also can't measure any performance loss: 4.3.1: 42.6300+-0.3961, https://github.com/vermaseren/form/commit/ce81b145886b7e0b5a6bea692285e6ef9ad74069: 42.9980+-0.3934.

jodavies commented 3 months ago

Do you feel there is anything further to add to this one? I think it is in a pretty good shape now.

tueda commented 3 months ago

Unless there is some performance loss for other use cases, I think this is fine. It should be merged before the next (beta?) version release.