MethodicalAcceleratorDesign / MAD-NG

MAD Next-Generation official repository
GNU General Public License v3.0
32 stars 11 forks source link

dev-tpsa-newlazy issues #436

Closed mattsignorelli closed 6 months ago

mattsignorelli commented 6 months ago

I've found two problems so far. They are kind of obscure and appear only with specific orders of operations, but these are the minimal working examples I have at the moment:

Say you have NV=2, MO=10. Let tpsa x correspond to the first variable ( so basically seti(x, 1, 0, 1) ) and tpsa y correspond to the second variable (so seti(y, 2, 0, 1)) :

1) Let a = (1+x) and b = y^2. Then b/a/a (and b/a*(1/a)) gives an incorrect result (leftover x^2 term), whereas b/(a*a), b/(a^2), and even (1/a)*b/a give correct results

Also, if the temporaries in the descriptor are used for the entire expression (y^2/(1+x)/(1+x) without allocating intermediate a and b) then the result is correct. If the temporaries are not used for the entire expression, the result is still incorrect

2) -x^2/(1+y) gives inconsistent results each evaluation, sometimes swapping the exponents for monomials, other times containing junk, adding in random factors to certain monomials, etc. For example the Julia output (which uses cycle instead of print to show the tpsa) is:

julia> x
TPS:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      1      1   0

julia> y
TPS:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      1      0   1

julia> -x^2/(1+y)
TPS:
 Coefficient                Order   Exponent
 -1.0000000000000000e+00      2      2   0
  1.0000000000000000e+00      3      2   1
 -1.0000000000000000e+00      4      2   2
  1.0000000000000000e+00      5      2   3
 -1.0000000000000000e+00      6      2   4
  1.0000000000000000e+00      7      2   5
 -1.0000000000000000e+00      8      2   6
  1.0000000000000000e+00      9      2   7
 -1.0000000000000000e+00     10      2   8

julia> -x^2/(1+y)
TPS:
 Coefficient                Order   Exponent
 -1.8916802897889015e+272      2      0   2
  1.0000000000000000e+00       3      2   1
 -1.0000000000000000e+00       4      2   2
  1.0000000000000000e+00       5      2   3
 -1.0000000000000000e+00       6      2   4
  1.0000000000000000e+00       7      2   5
 -1.0000000000000000e+00       8      2   6
  1.0000000000000000e+00       9      2   7
 -1.0000000000000000e+00      10      2   8

julia> -x^2/(1+y)
TPS:
 Coefficient                Order   Exponent
 -1.0000000000000000e+00      2      2   0
  1.0000000000000000e+00      3      2   1
 -1.0000000000000000e+00      4      2   2
  1.0000000000000000e+00      5      2   3
 -1.0000000000000000e+00      6      2   4
  1.0000000000000000e+00      7      2   5
 -1.0000000000000000e+00      8      2   6
  1.0000000000000000e+00      9      2   7
 -1.0000000000000000e+00     10      2   8

julia> -x^2/(1+y)
TPS:
 Coefficient                Order   Exponent
 -1.0000000000000000e+00      2      2   0
 -1.0000000000000000e+00      2      1   1
  1.0000000000000000e+00      3      2   1
 -1.0000000000000000e+00      4      2   2
  1.0000000000000000e+00      5      2   3
 -1.0000000000000000e+00      6      2   4
  1.0000000000000000e+00      7      2   5
 -1.0000000000000000e+00      8      2   6
  1.0000000000000000e+00      9      2   7
 -1.0000000000000000e+00     10      2   8
ldeniau commented 6 months ago

bug in mul fixed in c4cc9f6e, same for both cases. bug in seti fixed in d29aecf7 Map (complex) inversion still exhibit a bug...

mattsignorelli commented 6 months ago

Ok, now with tpsa x, say NV=1 and MO=5, I get the following bug for sin(x):

mad_tpsa_update:212: 't' { lo=1 hi=3 mo=5 ao=5 uid=0 did=0 ** bug @ o=2 }
 [0:0]=+0.0000E+00
 [1:1]=+1.0000E+00
 [2:2]=+0.0000E+00
 [3:3]=-1.6667E-01
 [4:4]=+2.1700E-28
 [5:5]=+1.0863E-312

I also see this with at least cos(x), tan(x) too, probably more (presumably bug in mad_tpsa_update?)

ldeniau commented 6 months ago

The output is not a bug, it's the check function in debug that is too strict... I will modify it to allow zero homogeneous polynomials between lo and hi. With only one variable this occurs easily as the polynomials have only one monomial... Nevertheless, sin(x) should generate values up to order 5, not 3... I'll check update too.

ldeniau commented 6 months ago

I fixed the check function to allow zero homogeneous polynomials in the range (lo,hi), and I added a density calculation to see if sparse TPSA representation would be more efficient. On the other hand I could not reproduce the problem mentioned for sin(x),cos(x) or tan(x) for one and two variables...

sin(x)    :  R, NV =   1, MO = 10
 *******************************************************
     I   COEFFICIENT             ORDER   EXPONENTS
     1   1.0000000000000000E+00    1     1
     2  -1.6666666666666666E-01    3     3
     3   8.3333333333333332E-03    5     5
     4  -1.9841269841269841E-04    7     7
     5   2.7557319223985893E-06    9     9

cos(x)    :  R, NV =   1, MO = 10
 *******************************************************
     I   COEFFICIENT             ORDER   EXPONENTS
     1   1.0000000000000000E+00    0     0
     2  -5.0000000000000000E-01    2     2
     3   4.1666666666666664E-02    4     4
     4  -1.3888888888888887E-03    6     6
     5   2.4801587301587298E-05    8     8
     6  -2.7557319223985888E-07   10     10

tan(x)    :  R, NV =   1, MO = 10
 *******************************************************
     I   COEFFICIENT             ORDER   EXPONENTS
     1   1.0000000000000000E+00    1     1
     2   3.3333333333333337E-01    3     3
     3   1.3333333333333336E-01    5     5
     4   5.3968253968253971E-02    7     7
     5   2.1869488536155206E-02    9     9
ldeniau commented 6 months ago

The tracking 4th order damap with 6 variables through the LHC was already working, so I don't think that any functions like sin, cos, etc. that are extensively used in elements maps are wrong, idem for operator +, -, *, inv. The map inversion and the Lie operators are only used in normal forms and other high order analysis (e.g. Lie log), so unless I missed something, the remaining bug is probably in some basic manipulation function like cycle or equivalent...

ldeniau commented 6 months ago

bug in seti fixed in 86662fb4 Last commit with some cleanup and factorisation works fine for RDTs along the LHC. As expected, this version is faster than all others by 5-10% on my laptop. Next step is to remove d->to everywhere (not tread-safe) and use mo vs ao for truncation order (thread-safe).

mattsignorelli commented 6 months ago

Sorry for the delay, I've been sidetracked with other projects. Tomorrow I will compile the latest dev-tpsa-newlazy, run all of my tests and let you know

ldeniau commented 6 months ago

Don't worry, I have fixed all the problems, and now I am optimising the "density" of the tpsa. However, running your test suite can't harm...

mattsignorelli commented 6 months ago

OK all my tests pass! Only minor thing I found was that, for DESC_USE_TMP=1, the #include "mad_tpsa.h" and #include "mad_ctpsa.h" in mad_desc_impl.h had to be moved to the top of the file below the other includes, or else the compiler doesn't know what the ctpsa_t type is in the descriptor struct definition

ldeniau commented 6 months ago

last point fixed in 49f288b3 The lib is now completed and can be tested in depth. The performances are as good if not better (by 10%) than the standard version on my laptop.