I'm running a single slice, extracted to blueprint hdf5, on an unstructured mesh with 60M tetrahedra. Running on 28 CPU cores, with 1 rank per core and the serial CPU backend, the slice takes 26 seconds, and the timings option shows most of the runtime in the extract stage.

Is this an expected runtime, or has something gone wrong? By comparison, slicing the same mesh in ParaView on a laptop takes under 1 second.

The run takes the same amount of time on every iteration, even though none of the structure of the Conduit data changes. Is there any caching happening?

A full-mesh extract with no pipeline takes 8.6 seconds, also quite a bit slower than expected.

If, instead of a blueprint extract, I run a render of the mesh, it takes 28.7 seconds, most of it in "strip_real_ghosts_ascent_ghosts". If I don't provide any ghost information, the timing is about the same, but most of it moves to "s1_p1".

Any ideas? Could it have something to do with using local indexing on each rank?

Culprit found: Ascent and VTK-m both default CMAKE_BUILD_TYPE to Release, but Conduit sets no default CMAKE_BUILD_TYPE, so it was built without optimizations. Rebuilding Conduit in Release mode brought the slice from 26.1 s down to 1.27 s.
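For reference, a slice-to-blueprint-hdf5 setup like the one described can be written as an Ascent actions file. The `slice` filter and `relay` extract names below follow Ascent's documented actions; the pipeline/extract names, slice point/normal values, and output path are placeholder assumptions, not the original configuration:

```yaml
-
  action: "add_pipelines"
  pipelines:
    pl1:
      f1:
        type: "slice"
        params:
          point:
            x: 0.0
            y: 0.0
            z: 0.0
          normal:
            x: 0.0
            y: 0.0
            z: 1.0
-
  action: "add_extracts"
  extracts:
    e1:
      type: "relay"
      pipeline: "pl1"
      params:
        path: "slice_out"
        protocol: "blueprint/mesh/hdf5"
```

With this kind of setup, the slice itself runs in VTK-m (Release-built), while the blueprint extract path leans heavily on Conduit, which is why an un-optimized Conduit build dominates the extract stage.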
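To avoid this class of problem, the build type can be set explicitly when configuring Conduit (`cmake -DCMAKE_BUILD_TYPE=Release ...`), or guarded in a host-config or wrapper CMake file. The guard below is a common generic CMake idiom, sketched as an assumption about your build setup rather than Conduit's actual CMake code:

```cmake
# CMake applies no optimization flags when CMAKE_BUILD_TYPE is empty,
# which is what made the un-optimized Conduit build so slow.
# Default to Release for single-config generators if nothing was chosen.
if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
  set(CMAKE_BUILD_TYPE "Release" CACHE STRING "Build type" FORCE)
endif()
```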