Open tbjoss opened 3 years ago
Would it be possible for you to generate a memory usage report with heaptrack (https://github.com/KDE/heaptrack)? Heaptrack is a KDE application and is available as a package in many distributions. Just run "heaptrack application_name"; I'm interested in the gz archive that heaptrack generates. To get meaningful results, bitpit should be compiled with the "RelWithDebInfo" build type.
Here is the heaptrack file for this example: https://www.dropbox.com/s/8ne5wpm5cdt60pg/heaptrack.grid_generator_d3q27.29121.gz?dl=0
Additional information:
This test requires 10 GB of memory for 6'773'056 cells. To reach the final octree, the procedure is:
Thanks for the heaptrack profile.
I briefly looked at the profile and saw that you are building both adjacencies and interfaces. Just to double check: does your code need the interfaces? Disabling the interfaces should save some memory.
I currently use the interfaces to mark hanging vertices and twin vertices (vertices that belong to two grid levels). If there's a way to do this more easily without interfaces, we could get rid of them.
If you need to identify the highlighted nodes, something along the lines of the following pseudocode may work (if needed, next week I can send you some real code):
for cell in cells
    for face in cell_faces
        // Only consider faces that have exactly one neighbour
        nFaceAdjacencies = octree.getAdjacencyCount(cell, face)
        if nFaceAdjacencies != 1
            continue
        end if

        // Function getLevel is only available for VolOctree patches
        cell_level = octree.getLevel(cell)
        neigh_level = octree.getLevel(octree.getAdjacency(cell, face))
        if cell_level == neigh_level
            continue
        end if

        // The face lies on a level jump: visit its vertices
        ConstProxyVector<long> faceVertexIds = cell.getFaceVertexIds(face)
        nFaceVertices = faceVertexIds.size()
        for k in 0 .. nFaceVertices - 1
            faceVertexId = faceVertexIds[k]
            ... Do something with vertex id ...
        end for
    end for
end for
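For reference, here is a minimal, self-contained C++ sketch of the same loop. It does not use the bitpit API: the Cell and Face structs and the collectLevelJumpVertices function are hypothetical stand-ins, assuming each face stores the ids of its neighbouring cells and of its own vertices. It collects the vertices of every face whose single neighbour sits on a different refinement level.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for one face of a cell (not a bitpit type).
struct Face {
    std::vector<long> neighbourIds; // ids of the cells adjacent through this face
    std::vector<long> vertexIds;    // ids of the vertices that define this face
};

// Hypothetical stand-in for an octree cell (not a bitpit type).
struct Cell {
    int level;               // refinement level of the cell
    std::vector<Face> faces; // faces of the cell
};

// Collect the vertices of all faces that have exactly one neighbour
// whose refinement level differs from the level of the cell itself.
// Cell ids are assumed to be indices into the `cells` vector.
std::set<long> collectLevelJumpVertices(const std::vector<Cell> &cells)
{
    std::set<long> marked;
    for (const Cell &cell : cells) {
        for (const Face &face : cell.faces) {
            // Skip boundary faces (no neighbour) and faces with several neighbours
            if (face.neighbourIds.size() != 1) {
                continue;
            }

            const long neighId = face.neighbourIds[0];
            assert(neighId >= 0);
            const std::size_t neighIndex = static_cast<std::size_t>(neighId);
            assert(neighIndex < cells.size());

            // Skip faces shared by cells of the same level
            if (cells[neighIndex].level == cell.level) {
                continue;
            }

            // Level jump: do something with the vertex ids (here, store them)
            marked.insert(face.vertexIds.begin(), face.vertexIds.end());
        }
    }
    return marked;
}
```

As in the pseudocode, a face on the coarse side of a level jump has more than one neighbour and is skipped, so the vertices are picked up from the fine side only. Real code would replace the stand-in structs with the corresponding VolOctree queries.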
Pull request #197 removes some unneeded allocations in the VolOctree class. It will not reduce memory usage, but it may speed up the update of the adjacencies a bit.
Thank you for your hint. Without the interfaces we can observe some performance improvements:
I'm looking at the memory profile; there are two things worth checking that might improve performance a bit:
In pull request #197 I added some more changes to the generation of VolOctree adjacencies. I think the new code should be a bit faster and should also slightly reduce memory usage (beware: we are still testing the code, so it may still have some bugs).
There is also a pull request (#193) that tries to improve how the ProxyVector handles internal storage, and another one (#187) that should improve the creation of the interfaces. These pull requests should help remove some allocations I see in the memory profile during the update of the interfaces.
What's the number of cells you would like to handle on your current hardware?
Thank you very much for this information. The flush_data functions are only called once, at the end of the run. While their performance can definitely be improved, they only make up a small fraction of the overall run time. I will, however, have a look at the levelset allocations (they are currently our main time sinks). The final number of cells we will have to handle will be on the order of 50-100 million, but that will obviously be done on a machine with significantly more memory. The reduction we got from removing the interfaces already helped a lot.
We just discovered that the update of the VolOctree mesh relies on cell interfaces to identify the vertices that should be deleted. This was unintended and should be fixed in #201. The branch is still under testing.
Hello Everyone,
I observe a peak in memory demand between "Cells removed: xxx" and "importing new octants". On the current hardware (16 GB available), this limits the octree size to fewer than 10 million cells. Is this demand normal? Also, I used a release build of bitpit to create the octree for this test.
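As a rough cross-check of the figures quoted in this thread, the sketch below does a back-of-the-envelope estimate. It assumes that peak memory scales linearly with the number of cells and that 1 GB is 10^9 bytes; neither assumption is confirmed anywhere in the thread, and the helper names are made up for illustration. Under those assumptions, 10 GB for 6'773'056 cells is about 1.5 kB per cell, which is consistent with a limit of roughly 10 million cells on 16 GB and would put 50-100 million cells at roughly 74 to 148 GB.

```cpp
#include <cassert>
#include <cstdint>

// Average memory per cell from a single measurement (hypothetical helper).
double bytesPerCell(double totalBytes, std::int64_t nCells)
{
    assert(nCells > 0);
    return totalBytes / static_cast<double>(nCells);
}

// Memory estimate for a target cell count, ASSUMING linear scaling.
double estimateBytes(double perCellBytes, std::int64_t nCells)
{
    assert(nCells >= 0);
    return perCellBytes * static_cast<double>(nCells);
}

// Number of cells that fit in a memory budget, ASSUMING linear scaling.
double cellsThatFit(double budgetBytes, double perCellBytes)
{
    assert(perCellBytes > 0.0);
    return budgetBytes / perCellBytes;
}
```

For example, bytesPerCell(10e9, 6773056) gives the per-cell figure, which can then be passed to estimateBytes or cellsThatFit. The result is only an order-of-magnitude guide, since the peak occurs during the mesh update and may not grow linearly.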