Closed by zhangchonglin 7 months ago
@zhangchonglin Thanks for the detailed report, especially for running the older versions. A few comments:
… pseudoXGCm_120kElms and pseudoPushAndSearch_t1 is concerning.
@cwsmith: thanks for these comments. I agree that pseudoXGCm_120kElms and pseudoPushAndSearch_t1 could be the focus, since these two tests exercise major parts of PUMIPic and use a single GPU. Cabana should not matter here since it is essentially not used in the above tests.
Running git bisect pointed at this commit for the performance drop in pseudoXGCm_120kElms and pseudoPushAndSearch_t1:
c17b75a9a5fde6ba815bfe68b9fac2adc64054d5 is the first bad commit
commit c17b75a9a5fde6ba815bfe68b9fac2adc64054d5
Author: Angelyr <scardking@gmail.com>
Date: Mon Nov 20 18:34:48 2023 -0500
fixed sigma = INT_MAX
particle_structs/src/scs/SCS_sort.h | 1 +
particle_structs/test/test_structure.cpp | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
This is interesting. There is effectively only a one line change in this commit:
sigma = std::min(sigma, std::max(num_elems, 1));
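To make the effect of that one line concrete, here is a minimal stand-alone sketch. It assumes sigma and num_elems are plain int (in PUMIPic they are lid_t), and clampSigma is a hypothetical helper name used only for illustration, not a function in SCS_sort.h.

```cpp
#include <algorithm>
#include <climits>

// Hypothetical helper wrapping the one-line change from the commit.
// Assumption: sigma and num_elems are int here; PUMIPic uses lid_t.
constexpr int clampSigma(int sigma, int num_elems) {
  // sigma is capped at the element count, and the cap itself is at least 1.
  return std::min(sigma, std::max(num_elems, 1));
}

// sigma = INT_MAX collapses to the number of elements.
static_assert(clampSigma(INT_MAX, 120000) == 120000, "clamped to num_elems");
// A sigma already below num_elems is left unchanged.
static_assert(clampSigma(1024, 120000) == 1024, "small sigma untouched");
// With zero elements the cap is 1 rather than 0.
static_assert(clampSigma(INT_MAX, 0) == 1, "never drops to zero");
```

So a test that passes sigma = INT_MAX ends up with sigma equal to num_elems after this line, which would mean one sorting window spanning every element.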
Yeah. I've reverted the commit locally and retested on Perlmutter, and the results look good. (An earlier set of runs was 'slow' across the board, but that turned out to be the node I was on; switching nodes resolved it.)
@zhangchonglin Would you mind trying on your machine? This branch https://github.com/SCOREC/pumi-pic/tree/cws/perfTests has the commit reverted.
Here are the current test results on an NVIDIA 3060 using https://github.com/SCOREC/pumi-pic/tree/cws/perfTests:
$ cat Testing/Temporary/CTestCostData.txt
viewComm_1 1 0.160911
viewComm_2 1 0.132697
viewComm_4 1 0.220739
type_test 1 0.119557
sort_test 1 0.0989485
scanTest 1 0.0974365
view_test 1 0.0997208
initParticles 1 0.10338
buildSCS 1 0.0947061
scs_padding 1 0.119062
lambdaTest 1 0.113555
write_ptcl_small 1 0.0877834
write_ptcl_small_4 1 0.164311
write_ptcl_4 1 0.185688
write_ptcl_empty 1 0.149241
write_ptcl_noptcls 1 0.145234
write_ptcl_medium 1 0.117534
write_ptcl_large 1 0.469619
test_structures_small 1 0.17026
test_structures_medium 1 0.269958
test_structures_large 1 1.50534
test_structures_small_4 1 1.56285
test_structures_4 1 1.71577
test_structures_empty 1 0.798312
test_structures_noptcls 1 0.837134
destroy_test 1 0.402936
barycentric_3 1 0.166155
test_adj_2d 1 0.659199
test_adj_3d 1 1.7997
search2d 1 0.204503
print_partition_cube_2 1 0.263603
ptn_loading_cube 1 0.414934
print_partition_cube_4 1 0.456839
ptn_loading_cube_4 1 0.416852
print_partition_pisces_4 1 0.442497
ptn_loading_pisces 1 0.427233
print_partition_2d_box_4 1 0.374372
ptn_loading_2d_box_4 1 0.429444
full_mesh_pisces 1 0.687399
input_construct_cube 1 0.649281
comm_array_pisces 1 0.978467
comm_array_2d_box 1 0.615361
file_rw_cube_4 1 0.591496
file_rw_xgc_24k_1 1 0.136416
file_rw_xgc_24k_4 1 0.3849
file_rw_xgc_120k_1 1 0.300448
file_rw_xgc_120k_4 1 0.611995
lb_r1 1 0.145082
lb_r4 1 0.533143
pseudoPushAndSearch_t1 1 0.266662
pseudoPushAndSearch_t2_r2 1 0.81152
pseudoPushAndSearch_cube_t1 1 0.291881
pseudoXGCm_scatter 1 0.176611
pseudoXGCm_24kElms 1 0.338749
pseudoXGCm_24kElms_4 1 3.69858
pseudoXGCm_120kElms 1 0.453264
pseudoXGCm_120kElms_4 1 1.77536
@cwsmith: The above commit seems to be the cause. Do you know why Angel made that change? The timing seems reasonable. I will test XGCm on Summit to see if it's the same case.
This is the test result using kokkos 3.7.02 with old PUMIPic commit d6a53c5.
Start 1: viewComm_1
1/52 Test #1: viewComm_1 ....................... Passed 0.19 sec
Start 2: viewComm_2
2/52 Test #2: viewComm_2 ....................... Passed 0.17 sec
Start 3: viewComm_4
3/52 Test #3: viewComm_4 ....................... Passed 0.22 sec
Start 4: type_test
4/52 Test #4: type_test ........................ Passed 0.14 sec
Start 5: view_test
5/52 Test #5: view_test ........................ Passed 0.10 sec
Start 6: initParticles
6/52 Test #6: initParticles .................... Passed 0.14 sec
Start 7: buildSCS
7/52 Test #7: buildSCS ......................... Passed 0.11 sec
Start 8: scs_padding
8/52 Test #8: scs_padding ...................... Passed 0.12 sec
Start 9: lambdaTest
9/52 Test #9: lambdaTest ....................... Passed 0.13 sec
Start 10: write_ptcl_small
10/52 Test #10: write_ptcl_small ................. Passed 0.09 sec
Start 11: write_ptcl_small_4
11/52 Test #11: write_ptcl_small_4 ............... Passed 0.17 sec
Start 12: write_ptcl_4
12/52 Test #12: write_ptcl_4 ..................... Passed 0.17 sec
Start 13: write_ptcl_empty
13/52 Test #13: write_ptcl_empty ................. Passed 0.17 sec
Start 14: write_ptcl_noptcls
14/52 Test #14: write_ptcl_noptcls ............... Passed 0.17 sec
Start 15: write_ptcl_medium
15/52 Test #15: write_ptcl_medium ................ Passed 0.13 sec
Start 16: write_ptcl_large
16/52 Test #16: write_ptcl_large ................. Passed 0.60 sec
Start 17: test_structures_small
17/52 Test #17: test_structures_small ............ Passed 0.14 sec
Start 18: test_structures_medium
18/52 Test #18: test_structures_medium ........... Passed 0.23 sec
Start 19: test_structures_large
19/52 Test #19: test_structures_large ............ Passed 1.13 sec
Start 20: test_structures_small_4
20/52 Test #20: test_structures_small_4 .......... Passed 0.81 sec
Start 21: test_structures_4
21/52 Test #21: test_structures_4 ................ Passed 0.86 sec
Start 22: test_structures_empty
22/52 Test #22: test_structures_empty ............ Passed 0.42 sec
Start 23: test_structures_noptcls
23/52 Test #23: test_structures_noptcls .......... Passed 0.46 sec
Start 24: destroy_test
24/52 Test #24: destroy_test ..................... Passed 0.31 sec
Start 25: barycentric_3
25/52 Test #25: barycentric_3 .................... Passed 0.14 sec
Start 26: test_adj_2d
26/52 Test #26: test_adj_2d ...................... Passed 0.61 sec
Start 27: test_adj_3d
27/52 Test #27: test_adj_3d ...................... Passed 1.44 sec
Start 28: search2d
28/52 Test #28: search2d ......................... Passed 0.18 sec
Start 29: print_partition_cube_2
29/52 Test #29: print_partition_cube_2 ........... Passed 0.27 sec
Start 30: ptn_loading_cube
30/52 Test #30: ptn_loading_cube ................. Passed 0.25 sec
Start 31: print_partition_cube_4
31/52 Test #31: print_partition_cube_4 ........... Passed 0.36 sec
Start 32: ptn_loading_cube_4
32/52 Test #32: ptn_loading_cube_4 ............... Passed 0.33 sec
Start 33: print_partition_pisces_4
33/52 Test #33: print_partition_pisces_4 ......... Passed 0.38 sec
Start 34: ptn_loading_pisces
34/52 Test #34: ptn_loading_pisces ............... Passed 0.36 sec
Start 35: full_mesh_pisces
35/52 Test #35: full_mesh_pisces ................. Passed 0.34 sec
Start 36: input_construct_cube
36/52 Test #36: input_construct_cube ............. Passed 0.40 sec
Start 37: comm_array_pisces
37/52 Test #37: comm_array_pisces ................ Passed 0.57 sec
Start 38: file_rw_cube_4
38/52 Test #38: file_rw_cube_4 ................... Passed 0.45 sec
Start 39: file_rw_xgc_24k_1
39/52 Test #39: file_rw_xgc_24k_1 ................ Passed 0.15 sec
Start 40: file_rw_xgc_24k_4
40/52 Test #40: file_rw_xgc_24k_4 ................ Passed 0.32 sec
Start 41: file_rw_xgc_120k_1
41/52 Test #41: file_rw_xgc_120k_1 ............... Passed 0.30 sec
Start 42: file_rw_xgc_120k_4
42/52 Test #42: file_rw_xgc_120k_4 ............... Passed 0.46 sec
Start 43: lb_r1
43/52 Test #43: lb_r1 ............................ Passed 0.13 sec
Start 44: lb_r4
44/52 Test #44: lb_r4 ............................ Passed 0.43 sec
Start 45: pseudoPushAndSearch_t1
45/52 Test #45: pseudoPushAndSearch_t1 ........... Passed 0.27 sec
Start 46: pseudoPushAndSearch_t2_r2
46/52 Test #46: pseudoPushAndSearch_t2_r2 ........ Passed 0.69 sec
Start 47: pseudoPushAndSearch_cube_t1
47/52 Test #47: pseudoPushAndSearch_cube_t1 ...... Passed 0.30 sec
Start 48: pseudoXGCm_scatter
48/52 Test #48: pseudoXGCm_scatter ............... Passed 0.15 sec
Start 49: pseudoXGCm_24kElms
49/52 Test #49: pseudoXGCm_24kElms ............... Passed 0.32 sec
Start 50: pseudoXGCm_24kElms_4
50/52 Test #50: pseudoXGCm_24kElms_4 ............. Passed 2.59 sec
Start 51: pseudoXGCm_120kElms
51/52 Test #51: pseudoXGCm_120kElms .............. Passed 0.35 sec
Start 52: pseudoXGCm_120kElms_4
52/52 Test #52: pseudoXGCm_120kElms_4 ............ Passed 1.24 sec
This is the test result using kokkos 4.2.00 with newest PUMIPic commit 7b55b1b plus reverting the problematic commit.
Start 1: viewComm_1
1/57 Test #1: viewComm_1 ....................... Passed 0.14 sec
Start 2: viewComm_2
2/57 Test #2: viewComm_2 ....................... Passed 0.15 sec
Start 3: viewComm_4
3/57 Test #3: viewComm_4 ....................... Passed 0.22 sec
Start 4: type_test
4/57 Test #4: type_test ........................ Passed 0.12 sec
Start 5: sort_test
5/57 Test #5: sort_test ........................ Passed 0.11 sec
Start 6: scanTest
6/57 Test #6: scanTest ......................... Passed 0.14 sec
Start 7: view_test
7/57 Test #7: view_test ........................ Passed 0.09 sec
Start 8: initParticles
8/57 Test #8: initParticles .................... Passed 0.10 sec
Start 9: buildSCS
9/57 Test #9: buildSCS ......................... Passed 0.12 sec
Start 10: scs_padding
10/57 Test #10: scs_padding ...................... Passed 0.12 sec
Start 11: lambdaTest
11/57 Test #11: lambdaTest ....................... Passed 0.10 sec
Start 12: write_ptcl_small
12/57 Test #12: write_ptcl_small ................. Passed 0.14 sec
Start 13: write_ptcl_small_4
13/57 Test #13: write_ptcl_small_4 ............... Passed 0.20 sec
Start 14: write_ptcl_4
14/57 Test #14: write_ptcl_4 ..................... Passed 0.20 sec
Start 15: write_ptcl_empty
15/57 Test #15: write_ptcl_empty ................. Passed 0.18 sec
Start 16: write_ptcl_noptcls
16/57 Test #16: write_ptcl_noptcls ............... Passed 0.20 sec
Start 17: write_ptcl_medium
17/57 Test #17: write_ptcl_medium ................ Passed 0.15 sec
Start 18: write_ptcl_large
18/57 Test #18: write_ptcl_large ................. Passed 0.61 sec
Start 19: test_structures_small
19/57 Test #19: test_structures_small ............ Passed 0.11 sec
Start 20: test_structures_medium
20/57 Test #20: test_structures_medium ........... Passed 0.21 sec
Start 21: test_structures_large
21/57 Test #21: test_structures_large ............ Passed 1.12 sec
Start 22: test_structures_small_4
22/57 Test #22: test_structures_small_4 .......... Passed 0.83 sec
Start 23: test_structures_4
23/57 Test #23: test_structures_4 ................ Passed 0.90 sec
Start 24: test_structures_empty
24/57 Test #24: test_structures_empty ............ Passed 0.46 sec
Start 25: test_structures_noptcls
25/57 Test #25: test_structures_noptcls .......... Passed 0.44 sec
Start 26: destroy_test
26/57 Test #26: destroy_test ..................... Passed 0.30 sec
Start 27: barycentric_3
27/57 Test #27: barycentric_3 .................... Passed 0.14 sec
Start 28: test_adj_2d
28/57 Test #28: test_adj_2d ...................... Passed 0.53 sec
Start 29: test_adj_3d
29/57 Test #29: test_adj_3d ...................... Passed 1.44 sec
Start 30: search2d
30/57 Test #30: search2d ......................... Passed 0.18 sec
Start 31: print_partition_cube_2
31/57 Test #31: print_partition_cube_2 ........... Passed 0.24 sec
Start 32: ptn_loading_cube
32/57 Test #32: ptn_loading_cube ................. Passed 0.22 sec
Start 33: print_partition_cube_4
33/57 Test #33: print_partition_cube_4 ........... Passed 0.39 sec
Start 34: ptn_loading_cube_4
34/57 Test #34: ptn_loading_cube_4 ............... Passed 0.42 sec
Start 35: print_partition_pisces_4
35/57 Test #35: print_partition_pisces_4 ......... Passed 0.42 sec
Start 36: ptn_loading_pisces
36/57 Test #36: ptn_loading_pisces ............... Passed 0.39 sec
Start 37: print_partition_2d_box_4
37/57 Test #37: print_partition_2d_box_4 ......... Passed 0.37 sec
Start 38: ptn_loading_2d_box_4
38/57 Test #38: ptn_loading_2d_box_4 ............. Passed 0.34 sec
Start 39: full_mesh_pisces
39/57 Test #39: full_mesh_pisces ................. Passed 0.37 sec
Start 40: input_construct_cube
40/57 Test #40: input_construct_cube ............. Passed 0.46 sec
Start 41: comm_array_pisces
41/57 Test #41: comm_array_pisces ................ Passed 0.72 sec
Start 42: comm_array_2d_box
42/57 Test #42: comm_array_2d_box ................ Passed 0.53 sec
Start 43: file_rw_cube_4
43/57 Test #43: file_rw_cube_4 ................... Passed 0.49 sec
Start 44: file_rw_xgc_24k_1
44/57 Test #44: file_rw_xgc_24k_1 ................ Passed 0.17 sec
Start 45: file_rw_xgc_24k_4
45/57 Test #45: file_rw_xgc_24k_4 ................ Passed 0.37 sec
Start 46: file_rw_xgc_120k_1
46/57 Test #46: file_rw_xgc_120k_1 ............... Passed 0.30 sec
Start 47: file_rw_xgc_120k_4
47/57 Test #47: file_rw_xgc_120k_4 ............... Passed 0.51 sec
Start 48: lb_r1
48/57 Test #48: lb_r1 ............................ Passed 0.14 sec
Start 49: lb_r4
49/57 Test #49: lb_r4 ............................ Passed 0.48 sec
Start 50: pseudoPushAndSearch_t1
50/57 Test #50: pseudoPushAndSearch_t1 ........... Passed 0.28 sec
Start 51: pseudoPushAndSearch_t2_r2
51/57 Test #51: pseudoPushAndSearch_t2_r2 ........ Passed 0.70 sec
Start 52: pseudoPushAndSearch_cube_t1
52/57 Test #52: pseudoPushAndSearch_cube_t1 ...... Passed 0.28 sec
Start 53: pseudoXGCm_scatter
53/57 Test #53: pseudoXGCm_scatter ............... Passed 0.14 sec
Start 54: pseudoXGCm_24kElms
54/57 Test #54: pseudoXGCm_24kElms ............... Passed 0.30 sec
Start 55: pseudoXGCm_24kElms_4
55/57 Test #55: pseudoXGCm_24kElms_4 ............. Passed 2.93 sec
Start 56: pseudoXGCm_120kElms
56/57 Test #56: pseudoXGCm_120kElms .............. Passed 0.42 sec
Start 57: pseudoXGCm_120kElms_4
57/57 Test #57: pseudoXGCm_120kElms_4 ............ Passed 1.39 sec
Great. Thanks for testing.
IIRC, there were test failures, or a memory leak, that the change was addressing.
@cwsmith This commit was to fix a performance issue that Dyhan noticed. The problem is that if sigma was greater than num_elems then no sorting happens.
On commit c17b75a9a5fde6ba815bfe68b9fac2adc64054d5 you probably should have used std::numeric_limits<lid_t>::max because the current code will break if you change the type of lid_t.
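A minimal sketch of the std::numeric_limits suggestion, assuming a hypothetical template parameter LidT standing in for lid_t (maxSigma is an illustrative name, not part of PUMIPic). INT_MAX stays tied to int no matter what lid_t becomes, while std::numeric_limits follows the type:

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Hypothetical helper: the largest value representable in the local-id type.
// LidT stands in for PUMIPic's lid_t; the name maxSigma is illustrative only.
template <typename LidT>
constexpr LidT maxSigma() {
  return std::numeric_limits<LidT>::max();
}

// When lid_t is int, the two spellings give the same value...
static_assert(maxSigma<int>() == INT_MAX, "same value when lid_t is int");
// ...but they diverge as soon as lid_t is widened or narrowed
// (true on common platforms where int is 32 bits).
static_assert(maxSigma<std::int64_t>() > INT_MAX, "64-bit lid_t has a larger max");
static_assert(maxSigma<std::int16_t>() < INT_MAX, "INT_MAX does not fit a 16-bit lid_t");
```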
Since sigma=INT_MAX, won't the following line always be num_elems, since num_elems will always be less than INT_MAX?
@jacobmerson the line sigma=INT_MAX only affects a few of our tests and is not in our source code. I will start looking into this issue today.
@cwsmith I have been testing the code and I have found some issues and some solutions. I want to hear your thoughts.
You can read this file for reference: https://github.com/SCOREC/pumi-pic/blob/ac/thrust-sort/particle_structs/test/sortTest.cpp
Findings:
While reading the documentation I found this line, which made the Kokkos sort-by-key significantly faster (15x). It now takes 1s at 1M elements:
int vectorLen = PolicyType::vector_length_max();
Could you explain why this helps, and is there a way to improve it further? Here are the docs for reference: https://kokkos.org/kokkos-core-wiki/API/algorithms/Sort.html
However, it is still slower than the thrust sort-by-key, which takes 0.0005s at 1M elements.
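For anyone following along who is unfamiliar with the operation being timed, here is a minimal CPU-only sketch of what "sort by key" means: the keys are sorted and the values are permuted in the same way. It uses only the C++ standard library and is not the Kokkos or Thrust implementation; sortByKey is a hypothetical name and the int key/value types are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical CPU reference for sort-by-key (not Kokkos or Thrust code).
// Sorts `keys` ascending and applies the same permutation to `values`.
void sortByKey(std::vector<int>& keys, std::vector<int>& values) {
  assert(keys.size() == values.size());

  // Build the permutation that sorts the keys; stable_sort keeps the
  // original relative order of entries with equal keys.
  std::vector<std::size_t> perm(keys.size());
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(),
                   [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });

  // Gather keys and values through the permutation.
  std::vector<int> sortedKeys(keys.size());
  std::vector<int> sortedValues(values.size());
  for (std::size_t i = 0; i < perm.size(); ++i) {
    sortedKeys[i] = keys[perm[i]];
    sortedValues[i] = values[perm[i]];
  }
  keys.swap(sortedKeys);
  values.swap(sortedValues);
}
```

For example, keys {3, 1, 2} with values {30, 10, 20} become keys {1, 2, 3} with values {10, 20, 30}. A small reference like this could be used to check the output of a GPU sort before comparing timings.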
@Angelyr: thank you for investigating this. This is a good discovery in that:
… PUMIPic …
… PUMIPic when they develop their own physics code, and let computer scientists like you handle the computer-science-related work. Otherwise, there might be performance issues they are not even aware of.
It sounds like we could also make the Kokkos developers aware of the issue, so they can address it on their side as well (aside from you addressing the issue in PUMIPic)?
@zhangchonglin I have a change that should resolve the issue on this branch. Feel free to test if you have time:
ac/thrust-sort
Thanks Angel! Will give it a try later!
A simple test shows that with your new branch, the time cost is on par with the old code. Only the pseudoXGCm_24kElms_4 test is about 10-15% slower. I need to test using XGCm with more particles to get reliable results.
Start 50: pseudoPushAndSearch_t1
50/57 Test #50: pseudoPushAndSearch_t1 ........... Passed 0.27 sec
Start 51: pseudoPushAndSearch_t2_r2
51/57 Test #51: pseudoPushAndSearch_t2_r2 ........ Passed 0.89 sec
Start 52: pseudoPushAndSearch_cube_t1
52/57 Test #52: pseudoPushAndSearch_cube_t1 ...... Passed 0.32 sec
Start 53: pseudoXGCm_scatter
53/57 Test #53: pseudoXGCm_scatter ............... Passed 0.14 sec
Start 54: pseudoXGCm_24kElms
54/57 Test #54: pseudoXGCm_24kElms ............... Passed 0.34 sec
Start 55: pseudoXGCm_24kElms_4
55/57 Test #55: pseudoXGCm_24kElms_4 ............. Passed 3.92 sec
Start 56: pseudoXGCm_120kElms
56/57 Test #56: pseudoXGCm_120kElms .............. Passed 0.40 sec
Start 57: pseudoXGCm_120kElms_4
57/57 Test #57: pseudoXGCm_120kElms_4 ............ Passed 1.68 sec
@Angelyr: with your fix, the XGCm time cost is also consistent with kokkos 3.7.02 and the earlier PUMIPic dated around June 2023. Thanks for fixing the issue.
… Comet with kokkos 4.2.00, cuda 11.7, gcc 11.4.1, and the latest PUMIPic and omega_h versions, I observed a significant slowdown (at least 3-4 times) in the total time cost, compared to kokkos 3.7.02, cuda 11.7, gcc 11.4.1, and earlier PUMIPic and omega_h versions dating back to June 2023.
… PUMIPic tests and found similar behavior.
… PUMIPic issue, but rather a kokkos or omega_h issue. I haven't had the time to trace back one by one.
… Nvidia A2000 GPU.
… SCS particle structure is used.
test time using kokkos 3.7.02
Code versions used:
test time using kokkos 4.2.00
Code versions used:
Configure and build script of PUMIPic:
Below are the complete logs from the tests
test time using kokkos 3.7.02
100% tests passed, 0 tests failed out of 52
Total Test time (real) = 30.23 sec
1/57 Test #1: viewComm_1 ....................... Passed 0.16 sec
Start 2: viewComm_2
2/57 Test #2: viewComm_2 ....................... Passed 0.18 sec
Start 3: viewComm_4
3/57 Test #3: viewComm_4 ....................... Passed 0.26 sec
Start 4: type_test
4/57 Test #4: type_test ........................ Passed 0.15 sec
Start 5: sort_test
5/57 Test #5: sort_test ........................ Passed 0.13 sec
Start 6: scanTest
6/57 Test #6: scanTest ......................... Passed 0.13 sec
Start 7: view_test
7/57 Test #7: view_test ........................ Passed 0.09 sec
Start 8: initParticles
8/57 Test #8: initParticles .................... Passed 0.15 sec
Start 9: buildSCS
9/57 Test #9: buildSCS ......................... Passed 0.17 sec
Start 10: scs_padding
10/57 Test #10: scs_padding ...................... Passed 0.25 sec
Start 11: lambdaTest
11/57 Test #11: lambdaTest ....................... Passed 0.15 sec
Start 12: write_ptcl_small
12/57 Test #12: write_ptcl_small ................. Passed 0.16 sec
Start 13: write_ptcl_small_4
13/57 Test #13: write_ptcl_small_4 ............... Passed 0.20 sec
Start 14: write_ptcl_4
14/57 Test #14: write_ptcl_4 ..................... Passed 0.22 sec
Start 15: write_ptcl_empty
15/57 Test #15: write_ptcl_empty ................. Passed 0.18 sec
Start 16: write_ptcl_noptcls
16/57 Test #16: write_ptcl_noptcls ............... Passed 0.18 sec
Start 17: write_ptcl_medium
17/57 Test #17: write_ptcl_medium ................ Passed 0.14 sec
Start 18: write_ptcl_large
18/57 Test #18: write_ptcl_large ................. Passed 0.58 sec
Start 19: test_structures_small
19/57 Test #19: test_structures_small ............ Passed 0.32 sec
Start 20: test_structures_medium
20/57 Test #20: test_structures_medium ........... Passed 0.32 sec
Start 21: test_structures_large
21/57 Test #21: test_structures_large ............ Passed 1.44 sec
Start 22: test_structures_small_4
22/57 Test #22: test_structures_small_4 .......... Passed 1.02 sec
Start 23: test_structures_4
23/57 Test #23: test_structures_4 ................ Passed 1.20 sec
Start 24: test_structures_empty
24/57 Test #24: test_structures_empty ............ Passed 0.58 sec
Start 25: test_structures_noptcls
25/57 Test #25: test_structures_noptcls .......... Passed 0.61 sec
Start 26: destroy_test
26/57 Test #26: destroy_test ..................... Passed 1.52 sec
Start 27: barycentric_3
27/57 Test #27: barycentric_3 .................... Passed 0.14 sec
Start 28: test_adj_2d
28/57 Test #28: test_adj_2d ...................... Passed 0.66 sec
Start 29: test_adj_3d
29/57 Test #29: test_adj_3d ...................... Passed 2.01 sec
Start 30: search2d
30/57 Test #30: search2d ......................... Passed 0.45 sec
Start 31: print_partition_cube_2
31/57 Test #31: print_partition_cube_2 ........... Passed 0.62 sec
Start 32: ptn_loading_cube
32/57 Test #32: ptn_loading_cube ................. Passed 0.28 sec
Start 33: print_partition_cube_4
33/57 Test #33: print_partition_cube_4 ........... Passed 0.46 sec
Start 34: ptn_loading_cube_4
34/57 Test #34: ptn_loading_cube_4 ............... Passed 0.48 sec
Start 35: print_partition_pisces_4
35/57 Test #35: print_partition_pisces_4 ......... Passed 0.51 sec
Start 36: ptn_loading_pisces
36/57 Test #36: ptn_loading_pisces ............... Passed 0.49 sec
Start 37: print_partition_2d_box_4
37/57 Test #37: print_partition_2d_box_4 ......... Passed 0.43 sec
Start 38: ptn_loading_2d_box_4
38/57 Test #38: ptn_loading_2d_box_4 ............. Passed 0.41 sec
Start 39: full_mesh_pisces
39/57 Test #39: full_mesh_pisces ................. Passed 0.81 sec
Start 40: input_construct_cube
40/57 Test #40: input_construct_cube ............. Passed 0.56 sec
Start 41: comm_array_pisces
41/57 Test #41: comm_array_pisces ................ Passed 0.90 sec
Start 42: comm_array_2d_box
42/57 Test #42: comm_array_2d_box ................ Passed 0.68 sec
Start 43: file_rw_cube_4
43/57 Test #43: file_rw_cube_4 ................... Passed 0.61 sec
Start 44: file_rw_xgc_24k_1
44/57 Test #44: file_rw_xgc_24k_1 ................ Passed 0.15 sec
Start 45: file_rw_xgc_24k_4
45/57 Test #45: file_rw_xgc_24k_4 ................ Passed 0.44 sec
Start 46: file_rw_xgc_120k_1
46/57 Test #46: file_rw_xgc_120k_1 ............... Passed 0.28 sec
Start 47: file_rw_xgc_120k_4
47/57 Test #47: file_rw_xgc_120k_4 ............... Passed 0.60 sec
Start 48: lb_r1
48/57 Test #48: lb_r1 ............................ Passed 5.18 sec
Start 49: lb_r4
49/57 Test #49: lb_r4 ............................ Passed 0.68 sec
Start 50: pseudoPushAndSearch_t1
50/57 Test #50: pseudoPushAndSearch_t1 ........... Passed 0.78 sec
Start 51: pseudoPushAndSearch_t2_r2
51/57 Test #51: pseudoPushAndSearch_t2_r2 ........ Passed 2.05 sec
Start 52: pseudoPushAndSearch_cube_t1
52/57 Test #52: pseudoPushAndSearch_cube_t1 ...... Passed 0.90 sec
Start 53: pseudoXGCm_scatter
53/57 Test #53: pseudoXGCm_scatter ............... Passed 0.14 sec
Start 54: pseudoXGCm_24kElms
54/57 Test #54: pseudoXGCm_24kElms ............... Passed 15.10 sec
Start 55: pseudoXGCm_24kElms_4
55/57 Test #55: pseudoXGCm_24kElms_4 ............. Passed 6.49 sec
Start 56: pseudoXGCm_120kElms
56/57 Test #56: pseudoXGCm_120kElms .............. Passed 5.36 sec
Start 57: pseudoXGCm_120kElms_4
57/57 Test #57: pseudoXGCm_120kElms_4 ............ Passed 16.31 sec
100% tests passed, 0 tests failed out of 57
Total Test time (real) = 75.46 sec