entity-toolkit / entity

New generation astrophysical plasma simulation code with CPU/GPU portability
https://entity-toolkit.github.io/wiki/

1D GRPIC #59

Open StaticObserver opened 1 month ago

StaticObserver commented 1 month ago

I finished the tests for the new metric; they all pass.

haykh commented 1 month ago

@StaticObserver nice! i changed the merge branch to 1.1.0rc. we try to never directly merge to master.

haykh commented 1 month ago

@StaticObserver i rebased the branch so it's now up-to-date with 1.1.0rc. Only changes relevant to the new metric are visible now.

haykh commented 1 month ago

the METRICS::fs test currently fails, likely due to accuracy issues. the numbers are close, but not close enough for cmp::AlmostEqual. @StaticObserver could you fix and push?

StaticObserver commented 1 month ago

The tests all passed on the HIP device and failed on the CUDA device. I carefully checked the expressions for f0 and found no errors, so I simply changed the accuracy requirement from 10 to 30; they should all pass now.

StaticObserver commented 1 month ago

> @StaticObserver nice! i changed the merge branch to 1.1.0rc. we try to never directly merge to master.

Sorry, I just found where to change the merge branch.

haykh commented 1 month ago

> The tests all passed on the HIP device and failed on the CUDA device. I carefully checked the expressions for f0 and found no errors, so I simply changed the accuracy requirement from 10 to 30; they should all pass now.

you mean in GH actions? it actually doesn't run the AMD tests, which is why the circle is orange. the nvidia ones are run automatically on my computer whenever i log in (github doesn't provide GPUs for free). i haven't set up the HIP container for gh actions yet; that one will have to run on my laptop.

if you click details on the failed test and scroll down, you should see where it fails. in this case, for instance, it fails at test 11 (double precision):

test 11
      Start 11: METRICS::coord_trans

11: Test command: /home/runner/actions-runner/_work/entity/entity/build/metrics/tests/test-metrics-coord_trans.xc
11: Working Directory: /home/runner/actions-runner/_work/entity/entity/build/metrics/tests
11: Test timeout computed to be: 10000000
11: 0 : 5.000000000000e-01 != 5.000000000000e-01 code->phys not invertible
11: 0 : 1.500000000000e+00 != 1.500000000000e+00 code->phys not invertible
11: 0 : 2.500000000000e+00 != 2.500000000000e+00 code->phys not invertible
11: 0 : 4.500000000000e+00 != 4.500000000000e+00 code->phys not invertible
11: 0 : 5.500000000000e+00 != 5.500000000000e+00 code->phys not invertible
11: 0 : 6.500000000000e+00 != 6.500000000000e+00 code->phys not invertible
11: 0 : 7.500000000000e+00 != 7.500000000000e+00 code->phys not invertible
11: 0 : 9.500000000000e+00 != 9.500000000000e+00 code->phys not invertible
11: 0 : 1.950000000000e+01 != 1.950000000000e+01 code->phys not invertible
11: 0 : 2.050000000000e+01 != 2.050000000000e+01 code->phys not invertible
11: code-sph-phys for 1D flux_surface failed with 10 errors
11/34 Test #11: METRICS::coord_trans .............***Failed    0.27 sec
0 : 5.000000000000e-01 != 5.000000000000e-01 code->phys not invertible
0 : 1.500000000000e+00 != 1.500000000000e+00 code->phys not invertible
0 : 2.500000000000e+00 != 2.500000000000e+00 code->phys not invertible
0 : 4.500000000000e+00 != 4.500000000000e+00 code->phys not invertible
0 : 5.500000000000e+00 != 5.500000000000e+00 code->phys not invertible
0 : 6.500000000000e+00 != 6.500000000000e+00 code->phys not invertible
0 : 7.500000000000e+00 != 7.500000000000e+00 code->phys not invertible
0 : 9.500000000000e+00 != 9.500000000000e+00 code->phys not invertible
0 : 1.950000000000e+01 != 1.950000000000e+01 code->phys not invertible
0 : 2.050000000000e+01 != 2.050000000000e+01 code->phys not invertible
code-sph-phys for 1D flux_surface failed with 10 errors

from the looks of it, the issue seems to be in the floating-point comparison. unfortunately, this is a very sensitive operation that often depends on the architecture, compiler, etc. so in these tests, i usually compare numbers with a large-enough margin of error, e.g., cmp::AlmostEqual(a, b, std::numeric_limits<real_t>::epsilon() * 1000) or something like that.
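A minimal sketch of the tolerance-based comparison described above. The helper `almost_equal_sketch` and its scaling rule are assumptions made for illustration; this is not the project's `cmp::AlmostEqual`, whose actual signature and behavior are not shown in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (NOT the project's cmp::AlmostEqual).
// Returns true when |a - b| <= n_eps * epsilon * max(1, |a|, |b|),
// i.e. the tolerance is `n_eps` machine epsilons, scaled by the magnitude
// of the operands so that large values are compared relatively.
template <typename T>
bool almost_equal_sketch(T a, T b, T n_eps = T(1000)) {
  assert(n_eps > T(0)); // a non-positive tolerance would reject everything
  const T eps   = std::numeric_limits<T>::epsilon();
  const T scale = std::max({ T(1), std::abs(a), std::abs(b) });
  return std::abs(a - b) <= n_eps * eps * scale;
}
```

With the default of 1000 epsilons, two doubles that agree to roughly 13 significant digits compare as equal, while a visible mismatch such as 0.5 versus 0.5001 is still rejected.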

haykh commented 1 month ago

Single precision tests also fail at test 11:

test 11
      Start 11: METRICS::coord_trans

11: Test command: /home/runner/actions-runner/_work/entity/entity/build/metrics/tests/test-metrics-coord_trans.xc
11: Working Directory: /home/runner/actions-runner/_work/entity/entity/build/metrics/tests
11: Test timeout computed to be: 10000000
11: 0 : 5.000000000000e-01 != 4.999967813492e-01 code->phys not invertible
11: 0 : 1.500000000000e+00 != 1.499975204468e+00 code->phys not invertible
11: 0 : 3.500000000000e+00 != 3.499992370605e+00 code->phys not invertible
11: 0 : 4.500000000000e+00 != 4.499970912933e+00 code->phys not invertible
11: 0 : 5.500000000000e+00 != 5.499964237213e+00 code->phys not invertible
11: 0 : 6.500000000000e+00 != 6.500025749207e+00 code->phys not invertible
11: 0 : 8.500000000000e+00 != 8.500012397766e+00 code->phys not invertible
11: 0 : 1.050000000000e+01 != 1.049997711182e+01 code->phys not invertible
11: 0 : 1.150000000000e+01 != 1.150001621246e+01 code->phys not invertible
11: 0 : 1.250000000000e+01 != 1.249997901917e+01 code->phys not invertible
11: code-sph-phys for 1D flux_surface failed with 10 errors
11/34 Test #11: METRICS::coord_trans .............***Failed    0.23 sec
0 : 5.000000000000e-01 != 4.999967813492e-01 code->phys not invertible
0 : 1.500000000000e+00 != 1.499975204468e+00 code->phys not invertible
0 : 3.500000000000e+00 != 3.499992370605e+00 code->phys not invertible
0 : 4.500000000000e+00 != 4.499970912933e+00 code->phys not invertible
0 : 5.500000000000e+00 != 5.499964237213e+00 code->phys not invertible
0 : 6.500000000000e+00 != 6.500025749207e+00 code->phys not invertible
0 : 8.500000000000e+00 != 8.500012397766e+00 code->phys not invertible
0 : 1.050000000000e+01 != 1.049997711182e+01 code->phys not invertible
0 : 1.150000000000e+01 != 1.150001621246e+01 code->phys not invertible
0 : 1.250000000000e+01 != 1.249997901917e+01 code->phys not invertible
code-sph-phys for 1D flux_surface failed with 10 errors

and here the difference is more prominent, but again likely due to truncation error.
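To put the single-precision numbers in context, the sketch below expresses a mismatch in units of machine epsilon. The helper `error_in_epsilons` is hypothetical, and it assumes the comparison is relative to the expected value, which this thread does not confirm.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: relative difference between `expected` and `got`,
// expressed in units of machine epsilon of the floating-point type T.
// The arithmetic itself is done in double so it does not add float error.
template <typename T>
double error_in_epsilons(double expected, double got) {
  assert(expected != 0.0); // relative error is undefined for expected == 0
  const double rel = std::abs(expected - got) / std::abs(expected);
  return rel / static_cast<double>(std::numeric_limits<T>::epsilon());
}
```

Applied to the log above, `error_in_epsilons<float>(0.5, 0.4999967813492)` comes out at roughly 54 epsilons and `error_in_epsilons<float>(1.5, 1.499975204468)` at roughly 139. Under the stated assumption, a margin of a few tens of epsilons would still fail these cases, while a margin on the order of 1000 epsilons would accept them.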

StaticObserver commented 1 month ago

I tried my best to reduce the truncation error.
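One common way to reduce roundoff in expressions of the form exp(x) - 1 is to call `expm1`, which the `eta2r` code quoted in the next comment does. The host-only sketch below illustrates why; the helper names are hypothetical, and it uses `std::` purely for illustration on the CPU (the thread goes on to ask for the Kokkos equivalents in device code).

```cpp
#include <cassert>
#include <cmath>

// Reference value of exp(x) - 1 for tiny |x|, from the truncated Taylor
// series x + x^2/2 + x^3/6. Hypothetical helper for this illustration only.
inline double expm1_reference(double x) {
  assert(std::abs(x) < 1e-3); // the truncated series is only meaningful for small |x|
  return x + x * x / 2.0 + x * x * x / 6.0;
}

// Relative error of the naive formula exp(x) - 1, which loses digits
// to cancellation because exp(x) is very close to 1 for small x.
inline double rel_err_naive(double x) {
  const double ref = expm1_reference(x);
  return std::abs((std::exp(x) - 1.0) - ref) / std::abs(ref);
}

// Relative error of expm1(x), which avoids the subtraction entirely.
inline double rel_err_expm1(double x) {
  const double ref = expm1_reference(x);
  return std::abs(std::expm1(x) - ref) / std::abs(ref);
}
```

For x = 1e-10 in double precision, the naive formula retains only about seven correct digits, whereas `expm1` stays at the level of machine precision.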

haykh commented 1 month ago

    Inline auto eta2r(const real_t& eta) const -> real_t {
      real_t diff = TWO * math::sqrt(ONE - SQR(a));
      real_t exp_m1 = std::expm1(eta * diff);
      return rh_m - diff / exp_m1;
    }

you shouldn't use std:: functions in device code, as they are not guaranteed to have CUDA/HIP counterparts. use the functions defined by Kokkos instead, which are architecture-portable: https://kokkos.org/kokkos-core-wiki/API/core/numerics/mathematical-functions.html

so std::expm1 has to be math::expm1 where math:: namespace is an alias of Kokkos::.
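A sketch of what the suggested change could look like. Everything except `Kokkos::sqrt`, `Kokkos::expm1`, and `KOKKOS_INLINE_FUNCTION` is an assumption: `eta2r_sketch` is a hypothetical free-function version that takes `a` and `rh_m` as arguments instead of reading class members, `real_t` is taken to be `double`, and the `std::` fallback exists only so the snippet also compiles on a machine without Kokkos.

```cpp
#include <cmath>

#if __has_include(<Kokkos_Core.hpp>)
  #include <Kokkos_Core.hpp>
  namespace math = Kokkos; // mirrors the alias described in the thread
  #define SKETCH_INLINE KOKKOS_INLINE_FUNCTION
#else
  // Assumption for this sketch only: host-only fallback to std::
  // when Kokkos headers are not available.
  namespace math = std;
  #define SKETCH_INLINE inline
#endif

using real_t = double; // assumption: double-precision build

// Hypothetical free-function variant of eta2r; `a` and `rh_m` are
// passed explicitly here rather than being members of a metric class.
SKETCH_INLINE auto eta2r_sketch(real_t eta, real_t a, real_t rh_m) -> real_t {
  const real_t diff   = real_t(2) * math::sqrt(real_t(1) - a * a);
  const real_t exp_m1 = math::expm1(eta * diff); // portable replacement for std::expm1
  return rh_m - diff / exp_m1;
}
```

The arithmetic is unchanged from the quoted snippet; only the namespace of the math calls differs, so the same source can be compiled for CUDA, HIP, or the host.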

StaticObserver commented 1 month ago

I see, but they worked well. I guess they were just not efficient?

haykh commented 1 month ago

@StaticObserver all your pushes will be published here automatically. no need for additional PR.

StaticObserver commented 1 month ago

> @StaticObserver all your pushes will be published here automatically. no need for additional PR.

I see, thanks for the reminder.