topology-tool-kit / ttk

TTK - Topological Data Analysis and Visualization - Source Code
https://topology-tool-kit.github.io/
Other
411 stars 121 forks source link

Flexible cell arrays? #484

Open dsakurai opened 3 years ago

dsakurai commented 3 years ago

This is a discussion / feature request.

TTK has a new implementation of setInputCells due to VTK9. Although most application may be unaffected by the new design, the requirements are at least redundant for most non-VTK applications of TTK and increases the RAM usage for somewhat. I wanted to shed light on the topic with this issue.

I am neutral about this issue since it is not a bug. Core TTK devs may be interested in taking time for a flexible approach if that's worth (fixing this issue does not take too much time).

Problems

TTK's new ExplicitTriangluation::setInputCells method assumes the VTK9's new convention, that takes as input LongSimplexId* connectivity and LongSimplexId* offset.

I have used it from a non-VTK application and had a few thoughts that may be relevant to a wider range of third-party app developers:

These arrays are of ttk::LongSimplexId, i.e. long long int, where the rest of TTK assumes ttk::SimplexId, i.e. int.

For desktop applications, and actually for most HPC applications I know, there's no point of using long long int is an overkill. (Edit: strike through words and add italic ones.) It merely increases the RAM usage for four times to store cells. My program copies its own cell array in ttk::SimplexId to a new one in ttk::LongSimplexId just for passing it to TTK's triangulation (edit: this is not critical but redundant).

i'th element of the offset array is always (d+1) x i for d-dimensional triangulated domain (with the exception of the last element)

This is a waste of RAM if the domain is made only of triangles, which is quite often the case for topological analysis even in VTK and ParaView. (Edit: setInputCells apparently assumes this, so this array is probably unnecessary indeed.)

The last element of offset array is unnecessary for TTK Because setInputCells takes an integer input to receive the same information anyway.

Fix?

For example, the data type of connectivity could be templatized to allow ttk::SimplexId or even a custom function. The last element of offset, as well as this array itself, can be made optional.

julien-tierny commented 3 years ago

hi daisuke,

thanks for your message.

TTK has a new implementation of setInputCells due to VTK9. absolutely. @CharlesGueunet authored it.

TTK's new ExplicitTriangluation::setInputCells method assumes the VTK9's new convention, that takes as input LongSimplexId* connectivity and LongSimplexId* offset. absolutely. the reason ttk sticks to vtk9's new convention is to enable a zero-copy access to the connectivity information (with just pointer passing). I agree that for other applications (not based on vtk), that requires to conform to that convention, which may not be optimal for certain usages (in terms of memory footprint) but which has the advantage of being fairly generic.

For desktop applications, and actually for most HPC applications I know, there's no point of using long long int. actually vtk has been using 64bit identifiers by default for some time now (this is why we currently expect this by default as an input).

This is a waste of RAM agreed. I found that the most important issue here was actually the slowdown induced by manipulating 64bit identifiers instead of 32bit identifiers (this is why ttk uses 32bits by default for its internal operations).

that being said, memory footprint does not seem to be a very important issue. only extremely large meshes do not fit in the main memory of most modern commodity computers (and for those, you're likely to need 64bit identifiers).

The last element of offset array is unnecessary for TTK at the moment, ttk assumes the input simplicial complex is homogeneous dimension wise but we hope to extend that this year, to support non manifold simplicial complexes.

The last element of offset, as well as this array itself, can be made optional. there's some interest to have a tailored backend for homogeneous simplicial complexes (reduced memory footprint and easier to use for non-vtk apps) but that would have the downside of having to maintain two implementations.

@CharlesGueunet: what are your thoughts?

cheers, -- Dr Julien Tierny CNRS Researcher Sorbonne Universite http://lip6.fr/Julien.Tierny

On Monday, 7 September 2020 05:41:34 CEST Daisuke Sakurai wrote:

This is a discussion / feature request.

TTK has a new implementation of setInputCells due to VTK9. Although most application may be unaffected by the new design, the requirements are at least redundant for most non-VTK applications of TTK and increases the RAM usage for somewhat. I wanted to shed light on the topic with this issue.

I am neutral about this issue since this is not a bug. Core TTK devs may be interested in taking time for a flexible approach if that's worth.

Problems

TTK's new ExplicitTriangluation::setInputCells method assumes the VTK9's new convention, that takes as input LongSimplexId* connectivity and LongSimplexId* offset.

I am using it from non-VTK application and had a few thoughts that may be relevant to a wider range of third-party app developers:

These arrays are of ttk::LongSimplexId, i.e. long long int, where the rest of TTK assumes ttk::SimplexId, i.e. int.

For desktop applications, and actually for most HPC applications I know, there's no point of using long long int. It merely increases the RAM usage for four times to store cells. My program also had to copy its own cell array in ttk::SimplexId to a new one in ttk::LongSimplexId just for passing it to TTK's triangulation.

i'th element of the offset array is always (d+1) x i for d dimensional triangulated domain (with the exception of the last element)

This is a waste of RAM if the domain is made only of triangles, which is quite often the case for topological analysis even in VTK and ParaView .

The last element of offset array is unnecessary for TTK Because setInputCells takes an integer input to receive the same information anyway.

Fix?

For example, the data type of connectivity could be templatized to allow ttk::SimplexId or even a custom function. The last element of offset, as well as this array itself, can be made optional.