topology-tool-kit / ttk

TTK - Topological Data Analysis and Visualization - Source Code
https://topology-tool-kit.github.io/

Merge Tree Distance Computation "segmentation fault" #1032

Closed peter-hristov closed 4 months ago

peter-hristov commented 4 months ago

Hi,

I'm having some issues with computing merge tree distances. When I load a volumetric data set in the .vti file format with two scalar fields (as point data arrays) in ParaView 5.12.1, compute the join trees, combine them with the Group Datasets filter, and then try to compute the distance between the join trees with the ttkMergeTreeDistanceMatrix filter, I get a segfault.

To reproduce the bug, start with ParaView 5.12.1 as downloaded from the official website (https://www.paraview.org/download/) and load the timeTracking.vti file from ttk-data; compute the join trees of the scalar fields "000" and "002" with the "TTK Merge and Contour Tree (FTM)" filter; group the two join trees with the "Group Datasets" filter; finally, use the filter "TTKMergeTreeDistanceMatrix" with Backend = "Wasserstein Distance", Assignment Solver = Auction, Epsilon 1 = 5%, Epsilon Threshold = 0%. When I click Apply on that filter, ParaView crashes and I get a segfault message in the terminal, with no other information about the error. Other parameters for TTKMergeTreeDistanceMatrix also produce a segfault.

image

To reproduce the error in C++, you can do the following. Compile VTK v9.3.0 from source (git commit 357d9efeed29cba6383ce626575f1a5f1ac1eefb) and TTK dev (git commit 60d7de008518e2ef5d303e29049513fa022036f8). Note that the ParaView issue described above was observed with ParaView as downloaded from the ParaView website, not compiled from source with these versions of VTK and TTK.

Use the following CMake file:

cmake_minimum_required(VERSION 3.21)
project(ttkExample-vtk-c++ LANGUAGES CXX C)
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_BUILD_TYPE "Release")
find_package(TTKVTK REQUIRED)
add_executable(ttkExample-vtk-c++ main.cpp)
target_link_libraries(ttkExample-vtk-c++ PUBLIC ttk::vtk::ttkAll)

Use the following main.cpp:

#include <iostream>

#include <ttkMergeTree.h>
#include <ttkPersistenceDiagram.h>
#include <ttkMergeTreeDistanceMatrix.h>
#include <ttkBlockAggregator.h>

#include <vtkSmartPointer.h>
#include <vtkXMLImageDataReader.h>
#include <vtkImageData.h>
#include <vtkMultiBlockDataSet.h>

using namespace std;

int main(int argc, char **argv) {

    ttk::Debug dbg;
    dbg.setDebugLevel(5);

    std::string inputFilename = "/home/peter/Projects/topoinvis/data/step_00000.vti";

    vtkNew<vtkXMLImageDataReader> reader{};
    reader->SetFileName(inputFilename.data());

    vtkNew<ttkMergeTree> tree00{};
    tree00->SetInputConnection(reader->GetOutputPort());
    tree00->SetInputArrayToProcess(0, 0, 0, vtkDataObject::FIELD_ASSOCIATION_POINTS, "orb00");
    tree00->SetBackend(1);
    tree00->SetTreeType(ttk::ftm::TreeType::Join);
    tree00->Update();

    vtkNew<ttkMergeTree> tree01{};
    tree01->SetInputConnection(reader->GetOutputPort());
    tree01->SetInputArrayToProcess(0, 0, 0, vtkDataObject::FIELD_ASSOCIATION_POINTS, "orb01");
    tree01->SetBackend(0);
    tree01->SetTreeType(ttk::ftm::TreeType::Join);
    tree01->Update();

    //vtkNew<ttkBlockAggregator> blocks{};
    //blocks->AddInputData(tree00->GetOutput());
    //blocks->AddInputData(tree01->GetOutput());
    //blocks->Update();

    cout << "Combining data sets..." << endl;
    vtkNew<vtkMultiBlockDataSet> treeGroup{};
    treeGroup->SetBlock(0, tree00->GetOutput());
    treeGroup->SetBlock(1, tree01->GetOutput());

    vtkNew<ttkMergeTreeDistanceMatrix> distanceMatrix{};
    distanceMatrix->SetInputDataObject(treeGroup);
    distanceMatrix->SetPersistenceThreshold(0.01);
    distanceMatrix->Update();

//# create a new 'TTK MergeTreeClustering'
//tTKMergeTreeClustering1 = TTKMergeTreeClustering(
    //Input=mT_all, OptionalInputclustering=mt_JT_all
//)
}

For the CPP code I also get a segfault when I try to use the default backend for merge tree computation (FTM, not EXTREEM) and I also get a segfault if I try to compute the ttkMergeTreeClustering.

image

I am using Ubuntu 22.04 and GCC 11.4.0. I am using the following configuration for compiling TTK from source. image

Best wishes, Peter Hristov

julien-tierny commented 4 months ago

Hi Peter,

thanks a lot for your issue (I CC @MatPont just in case).

First, in an ideal world, TTK should not segfault but instead return an error message (let's put that on the TODO list).
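
As a rough illustration of that TODO item, here is a minimal, library-free sketch of a guard that reports a bad input instead of reading through it. `TreeInput` and `checkTreeInput` are hypothetical names made up for this example and are not part of TTK or VTK; only the 1-on-success / 0-on-failure return value mirrors the convention VTK uses for `RequestData`.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for one merge tree handed to a filter;
// this is not a TTK or VTK type.
struct TreeInput {
  std::vector<int> nodeIds;
  std::vector<int> arcIds;
};

// Returns 1 on success and 0 on failure, and reports the reason on
// stderr instead of reading through a null or empty input.
int checkTreeInput(const TreeInput *input, const std::string &label) {
  if(input == nullptr) {
    std::cerr << "[error] " << label << ": input is missing\n";
    return 0;
  }
  if(input->nodeIds.empty() || input->arcIds.empty()) {
    std::cerr << "[error] " << label << ": tree has no nodes or no arcs\n";
    return 0;
  }
  return 1;
}
```

A caller would check the return value and stop early, so a malformed pipeline ends with a readable message rather than a crash.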

Second, I suspect there might be a problem in your pipeline setting here. I suggest you consult a working example to see how to put your pipeline together, such as https://topology-tool-kit.github.io/examples/mergeTreeFeatureTracking/

With the dev version of TTK (it seems you are using an older version), here is an example of how to proceed in ParaView:

  1. compute the join tree of "000" (TTKMergeTree1)
  2. compute the join tree of "002" (TTKMergeTree2)
  3. group all the nodes together by selecting the outputs "Skeleton Nodes" of TTKMergeTree1 and TTKMergeTree2 in this order (GroupDatasets1)
  4. group all the arcs together by selecting the outputs "Skeleton Arcs" of TTKMergeTree1 and TTKMergeTree2 in this order (GroupDatasets2)
  5. group the selected nodes and arcs (GroupDatasets3)
  6. call "TTK Merge Tree Clustering" on GroupDatasets3

I have just tested this pipeline on the development version of TTK and it does the job.
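
To make the nesting in steps 3 to 5 concrete, here is a small sketch that models the grouped input as a plain tree of named blocks and checks its shape. It deliberately avoids VTK: `Block`, `groupTrees` and `checkLayout` are hypothetical names for this illustration, not TTK or VTK API, and the checks only encode the layout described in the steps above (one group of nodes, one group of arcs, same number of trees in each).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a (possibly nested) multi-block container.
// Requires C++17, where std::vector may hold an incomplete element type.
struct Block {
  std::string name;
  std::vector<Block> children; // empty for a leaf data set
};

// Steps 3 to 5: build [[nodes1, nodes2, ...], [arcs1, arcs2, ...]],
// keeping the trees in the same order in both groups.
Block groupTrees(const std::vector<std::string> &treeNames) {
  Block nodes{"nodes", {}};
  Block arcs{"arcs", {}};
  for(const auto &tree : treeNames) {
    nodes.children.push_back(Block{tree + ":Skeleton Nodes", {}});
    arcs.children.push_back(Block{tree + ":Skeleton Arcs", {}});
  }
  return Block{"all", {nodes, arcs}};
}

// Returns an empty string when the layout matches the steps above,
// otherwise a short description of what is wrong.
std::string checkLayout(const Block &all) {
  if(all.children.size() != 2)
    return "expected exactly two groups (nodes, then arcs)";
  const Block &nodes = all.children[0];
  const Block &arcs = all.children[1];
  if(nodes.children.empty())
    return "first group holds no node blocks (flat grouping?)";
  if(nodes.children.size() != arcs.children.size())
    return "node and arc groups hold different numbers of trees";
  return "";
}
```

With this model, `groupTrees({"TTKMergeTree1", "TTKMergeTree2"})` passes `checkLayout`, whereas a flat group that holds the two tree outputs directly is rejected because its first block has no children.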

Thanks for letting me know if that works out for you.

Have a great day. Best,

peter-hristov commented 4 months ago

Hi Julien,

Thank you so much for the reply!

This worked! I had not realised from the clustering or feature tracking example that you have to group the nodes and arcs separately and then group the groups.

I want to extend the merge tree clustering example (https://topology-tool-kit.github.io/examples/mergeTreeClustering/), but I could not get it to run. I've checked out the latest commit of ttk-data (64c24b6fdc1aad4ad25953dc53781a696f7fbb38) and I've autoloaded the "TopologyToolkit" plugin in ParaView 5.12.1 (as downloaded from https://www.paraview.org/download/). I'm using Ubuntu 22.04. When I try to run ParaView on the state file with ~/Applications/ParaView-5.12.1-MPI-Linux-Python3.10-x86_64/bin/paraview ./states/mergeTreeClustering.pvsm, I get the following list of errors:

image image

I've tried some of the other examples like ./states/persistentGenerators_at.pvsm and they work just fine.

I also tried running the example with pvpython with the following command ~/Applications/ParaView-5.12.1-MPI-Linux-Python3.10-x86_64/bin/pvpython ./python/mergeTreeClustering.py, but I got the following error:

image

I figured maybe ttk isn't autoloaded for pvpython so I tried adding LoadPlugin("/home/peter/Applications/ParaView-5.12.1-MPI-Linux-Python3.10-x86_64/lib/paraview-5.12/plugins/TopologyToolKit/TopologyToolKit.so") to the top of the script (after from paraview.simple import *), but I'm getting the same error.

In any case, my final goal is to be able to run this pipeline in python (or at least C++) without paraview. I've started with a C++ version because I find it easier to debug.

With the following cmake file:

cmake_minimum_required(VERSION 3.21)
# name of the project
project(ttkExample-vtk-c++ LANGUAGES CXX C)

set(CMAKE_CXX_STANDARD 14)

set(CMAKE_BUILD_TYPE Debug)

find_package(TTKVTK REQUIRED)

add_executable(ttkExample-vtk-c++ main.cpp)

target_link_libraries(ttkExample-vtk-c++
  PUBLIC
    ttk::vtk::ttkAll
    )

and the following main.cpp file:

#include <iostream>

#include <ttkMergeTree.h>
#include <ttkMergeTreeDistanceMatrix.h>
#include <ttkMergeTreeClustering.h>
#include <ttkBlockAggregator.h>

#include <ttkArrayPreconditioning.h>

#include <ttkPersistenceDiagram.h>
#include <ttkPersistenceDiagramClustering.h>

#include <vtkSmartPointer.h>
#include <vtkXMLImageDataReader.h>
#include <vtkImageData.h>
#include <vtkMultiBlockDataSet.h>

using namespace std;

void computeMergeTreeDistances(const string inputFilename)
{
    ttk::Debug dbg;
    dbg.setDebugLevel(5);

    vtkNew<vtkXMLImageDataReader> reader{};
    reader->SetFileName(inputFilename.data());

    cout << "Computing merge tree for the first time step..." << endl;
    vtkNew<ttkMergeTree> tree1{};
    tree1->SetInputConnection(reader->GetOutputPort());
    tree1->SetInputArrayToProcess(0, 0, 0, vtkDataObject::FIELD_ASSOCIATION_POINTS, "000");
    tree1->SetBackend(1);
    tree1->SetTreeType(ttk::ftm::TreeType::Join);
    tree1->Update();

    cout << "Computing merge tree for the second time step..." << endl;
    vtkNew<ttkMergeTree> tree2{};
    tree2->SetInputConnection(reader->GetOutputPort());
    tree2->SetInputArrayToProcess(0, 0, 0, vtkDataObject::FIELD_ASSOCIATION_POINTS, "002");
    tree2->SetBackend(0);
    tree2->SetTreeType(ttk::ftm::TreeType::Join);
    tree2->Update();

    auto nodes1 = tree1->GetOutput(0);
    auto arcs1 = tree1->GetOutput(1);

    auto nodes2 = tree2->GetOutput(0);
    auto arcs2 = tree2->GetOutput(1);

    vtkNew<vtkMultiBlockDataSet> groupNodes{};
    groupNodes->SetBlock(0, nodes1);
    groupNodes->SetBlock(1, nodes2);

    vtkNew<vtkMultiBlockDataSet> groupArcs{};
    groupArcs->SetBlock(0, arcs1);
    groupArcs->SetBlock(1, arcs2);

    vtkNew<vtkMultiBlockDataSet> groupAll{};
    groupAll->SetBlock(0, groupNodes);
    groupAll->SetBlock(1, groupArcs);

    vtkNew<ttkMergeTreeClustering> clustering{};
    clustering->SetInputDataObject(groupAll);
    clustering->Update();
}

int main(int argc, char **argv) 
{
    ttk::Debug dbg;
    dbg.setDebugLevel(5);

    std::string inputFilename = "/home/peter/Projects/data/ttk-data/timeTracking.vti";

    computeMergeTreeDistances(inputFilename);
}

I'm getting a segfault with the following gdb output:

image

Do you have any ideas about how I can proceed with this?

julien-tierny commented 4 months ago

Hi Peter,

thanks for your email.

> I want to extend the merge tree clustering example (https://topology-tool-kit.github.io/examples/mergeTreeClustering/), but I could not get it to run. I've checked out the latest commit of ttk-data (64c24b6fdc1aad4ad25953dc53781a696f7fbb38) and I've autoloaded the "TopologyToolkit" plugin in ParaView 5.12.1 (as downloaded from https://www.paraview.org/download/). I'm using Ubuntu 22.04. When I try to run ParaView on the state file ~/Applications/ParaView-5.12.1-MPI-Linux-Python3.10-x86_64/bin/paraview ./states/mergeTreeClustering.pvsm I get the following list of errors:

Based on your screenshot, it seems that you are using the TTK plugin built by Kitware, which ships with ParaView. However, this version of the plugin does not include all of TTK's features (for various reasons). For instance, it does not include the filters relying on scikit-learn, which are precisely the ones used by this example. You can see this in the error message "No proxy that matches: group=filters and proxy=ttkDimensionReduction" (ParaView cannot find the filter ttkDimensionReduction because it hasn't been built).

> I also tried running the example with pvpython with the following command ~/Applications/ParaView-5.12.1-MPI-Linux-Python3.10-x86_64/bin/pvpython ./python/mergeTreeClustering.py, but I got the following error:

It looks like ttkCinemaReader isn't built either in Kitware's build of TTK (error message "NameError: name 'TTKCinemaReader' is not defined").

If you want to use these filters, I'd recommend using TTK's version of the plugin. We provide packages for various systems, or you can simply build it from source (see the website for installation instructions).

> In any case, my final goal is to be able to run this pipeline in python

If I were you, I'd proceed as follows:

  • Run within ParaView the steps I mentioned in my previous message (steps 1 to 6)
  • Save the state as a python state file

Then you'll be able to run your pipeline directly from pvpython (or even python if you set your environment variables right). The generated script will be verbose but then you'll be able to trim it (this is pretty much how we generate the python code snippets from the example website).

I hope this helps!

(we can continue this discussion on the user mailing list if you want, @.***)

Cheers, -- Dr Julien Tierny CNRS Researcher Sorbonne Universite https://julien-tierny.github.io/

(In reply to peter_hristov's message of Friday, May 31, 2024, 4:15:22 PM CEST, which is the previous comment in this thread.)