Slicer / Slicer

Multi-platform, free open source software for visualization and image computing.
https://www.slicer.org
Other
1.73k stars 560 forks source link

`FiberBundleFile` and `ModelFile` data type loading may not be working correctly #7363

Closed jhlegarreta closed 1 year ago

jhlegarreta commented 1 year ago

Summary

Following a number of test failures in the SlicerDMRI extension, and after digging a little bit more, it turns out the FiberBundleFile and ModelFile data types are not working properly:

Similar error messages can be found across the failing tests in the 3D Slicer dashboard: https://slicer.cdash.org/viewTest.php?onlyfailed&buildid=3203300

Some thoughts:

However, there is no guarantee that those methods will work, no tests exists for those methods in 3D Slicer proper, and this may not address the underlying issues.

Steps to reproduce

Execute the SlicerDMRI extension tests:

  1. Checkout 3D Slicer.
  2. Build 3D Slicer.
  3. Checkout the extension code.
  4. Build the extension against the built 3D Slicer.
  5. Execute the tests with ctest -V from the inner-build folder of the SlicerDMRI build folder.

Additional information

https://github.com/SlicerDMRI/SlicerDMRI/issues/207 https://github.com/Slicer/Slicer/issues/7342

lassoan commented 1 year ago

This happens at the SampleData.downloadFromURL method, which ultimately relies on SampleDataLogic.downloadFromSource to load the data. This method does not seem to cover the FiberBundleFile and ModelFile cases.

SampleData.downloadFromURL works well for models. For example:

import SampleData
SampleData.downloadFromURL(
    nodeNames="tract1",
    fileNames="tract1.vtk",
    uris="https://github.com/Slicer/slicer.kitware.com-midas3-archive/releases/download/SHA256/06d5b5777915857fbac7b3cbd9c371523d1371f29b0c89eb7a33d86d780d5b2b",
    checksums="SHA256:06d5b5777915857fbac7b3cbd9c371523d1371f29b0c89eb7a33d86d780d5b2b",
    loadFileTypes="ModelFile")

You need to specify the file type, because (as described in the method documentation) by default VolumeFile is used. Slicer has now a method that can determine the most likely file type for a specific file (the IO manager queries all the file reader plugins for that file for a confidence that it is the best reader for that file; and returns the file type corresponding to the most confident reader).

So, instead of hardcoding VolumeFile, we could use filetype = app.coreIOManager().fileType(filename) by default. It would be great if you could implement and test this small change and send a pull request.

Andras mentioned that the https://github.com/Slicer/Slicer/issues/7342#issuecomment-1793967995. However, as said above, the failing methods ultimately rely on 3D Slicer proper methods to load the data, so the failure stems from 3D Slicer proper. Andras reported crashes of the FiberBundleFile data type in 3D Slicer versions as recent as 5.4.0, and 5.5. This matches the dashboard and local tests, where the HEAD (dashboard)/a recent (locally) commit is being used. Also, a search in the extension does not provide meaningful results about the readers being present in the SlicerDMRI module.

Yes, indeed, there is an independent bug in SlicerDMRI that makes Slicer crash on Windows when you want to load a fiber bundle file. It happens when you simply try to load the file locally using "Add data" window. It might be a side effect of VTK updates, or maybe Slicer application changes, but most likely needs to be fixed in SlicerDMRI.

jhlegarreta commented 1 year ago

You need to specify the file type, because (as described in the method documentation) by default VolumeFile is used.

@lassoan the file type is being specified in all the failing instances.

Yes, indeed, there is an independent bug in SlicerDMRI that makes Slicer crash on Windows when you want to load a fiber bundle file.

The tests fails/crash on other operating systems, as mentioned.

lassoan commented 1 year ago

Can you provide an example where ModelFile loading fails?

jhlegarreta commented 1 year ago

ModelFile loading does not fail (see https://github.com/Slicer/Slicer/issues/7363#issue-1983777148): it fails when querying about fiber-specific properties: see the ToDo comment I added around this line: https://github.com/SlicerDMRI/SlicerDMRI/compare/master...jhlegarreta:SlicerDMRI:FixDataDownloadURLsInTests#diff-2df3de2b744083535216f185df68d92833d17a0c8ba92ae96d4436be412d4f17R105

So I'd say that the issue is not whether things can be made to work switching to ModelFile data types; there is something that needs to be fixed to load FiberBundleFile data types.

lassoan commented 1 year ago

So I'd say that the issue is not whether things can be made to work switching to ModelFile data types; there is something that needs to be fixed to load FiberBundleFile data types.

OK, that's my understanding, too. I'll update SampleData module to detect file type automatically if not specified. Other than that I don't see anything else that we could do in Slicer core to fix fiber bundle reading, as it the fiber bundle reader is implemented in SlicerDMRI.

jamesobutler commented 1 year ago

I was tracking down along https://github.com/SlicerDMRI/SlicerDMRI/commit/c0c7c83445b97c0ba0055ca4ce4a8acd4646ae45 and saw https://github.com/SlicerDMRI/SlicerDMRI/pull/123#issuecomment-484722200 about the creation of a node with tract1_1. Maybe these past details will help with the current work.

jhlegarreta commented 1 year ago

Other than that I don't see anything else that we could do in Slicer core to fix fiber bundle reading, as it the fiber bundle reader is implemented in SlicerDMRI.

OK, dug more into this:

Re https://github.com/Slicer/Slicer/issues/7363#issuecomment-1802867834 this is relevant to the overall thing, James. Thanks.

lassoan commented 1 year ago

We often get crashes or other errors in various places in Slicer when we update to a more recent VTK version. I've run the fiber bundle import in Visual Studio debugger to see what goes wrong and found that the crash is due to input is nullptr in vtkPolyDataTensorToColor::RequestData. It was because the input of the filter was an unstructured grid instead of polydata. Probably the selection filter in VTK was changed to produce unstructured grid instead of polydata. I think I could fix the issue by adding a conversion filter. See PR here: https://github.com/SlicerDMRI/SlicerDMRI/pull/208. I did not pay attention to formatting, naming, adding comments, etc. it is just a proof of concept for the fix.

jhlegarreta commented 1 year ago

Thanks for the effort @lassoan. Checked out the branch, compiled, run the tests, and the concerned tests pass :white_check_mark:.

@lassoan Feel free to amend the commit to add some relevant explanation about why such changes were necessary. Thank you for your work and support.

I will rebase the PR on master after PR https://github.com/SlicerDMRI/SlicerDMRI/pull/191 gets merged.

As a side note, this morning I had traced the issues down to the statement

this->GetMRMLScene()->AddNode(fiberBundleNode);

at https://github.com/SlicerDMRI/SlicerDMRI/blob/372675b96e917acb61bc8cb2ac9262d87cc5b06a/Modules/Loadable/TractographyDisplay/Logic/vtkSlicerFiberBundleLogic.cxx#L134

as the one triggering the error, not even close to what your investigation concluded. So definitely difficult to catch without extensive 3D Slicer knowledge.

Anyways, it has been productive, and a step towards better understanding about 3D Slicer, and its ecosystem.

Re https://github.com/Slicer/Slicer/issues/7363#issuecomment-1802867834 As for the ModelFile behavior of adding an e.g. _1 suffix to the node name, not sure if this is expected. The test failure when querying about the node name e.g. tract1 after applying https://github.com/SlicerDMRI/SlicerDMRI/pull/208 is still present: @jcfr may have more insight about this.

lassoan commented 1 year ago

As a side note, this morning I had traced the issues down to the statement

Yes, that's the same that I concluded. This call triggers a number of VTK events.

I agree that it was not very easy to catch the error. I ran the code in debug mode in Visual Studio debugger and it stopped right at the nullptr use and I could inspect the stack trace. I then added a null-pointer checks to avoid the crash (so that I can reproduce the issue many times without crashing the application). I added few breakpoints higher in the call stack, reproduced the error, and stepped through the code until I found where the filter input was set to nullptr. This did not require much Slicer-specific knowledge, but mostly experience with VTK debugging in an interactive debugger. It would have been nearly impossible to figure out what was happening just by adding log messages, reading code, or using a non-GUI debugger.

VTK developers tend to make subtle backward-incompatible changes all the time. Probably the only way to deal with this is to test all important features and fix any regressions. This is what happened here, so our processes worked (except we did not react to failures quickly).

As for the ModelFile behavior of adding an e.g. _1 suffix to the node name, not sure if this is expected.

In the Slicer GUI, nodes are typically identified by their name, so you would not want different nodes to have the same name. To save time for the user, Slicer adds suffix to the name to make it unique.

In code, it should not be necessary to find a node by name. Instead, you can save the node in a variable when the node is created.

jhlegarreta commented 1 year ago

Thanks for all this investigation and information Andras. I will need to properly set the debugging environment as time permits.

In the Slicer GUI, nodes are typically identified by their name, so you would not want different nodes to have the same name. To save time for the user, Slicer adds suffix to the name to make it unique.

I'd say that this happened when no previous node had been loaded, but I may be wrong.

lassoan commented 1 year ago

Proof of concept fix shows that SlicerDMRI needs to be updated, so I'm closing the issue here.