InsightSoftwareConsortium / ITK

Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
https://itk.org
Apache License 2.0
1.43k stars 668 forks source link

itkErodeObjectMorphologyImageFilterTest write/write race with SetPixel() #4969

Closed seanm closed 2 days ago

seanm commented 1 week ago

Running the itkErodeObjectMorphologyImageFilterTest test with TSan:

WARNING: ThreadSanitizer: data race (pid=9174)
  Write of size 2 at 0x000107a039c8 by thread T10:
    #0 itk::NeighborhoodAccessorFunctor<itk::Image<unsigned short, 2u>>::Set(unsigned short*, unsigned short const&) const itkNeighborhoodAccessorFunctor.h:78 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001e7924)
    #1 itk::NeighborhoodIterator<itk::Image<unsigned short, 2u>, itk::ZeroFluxNeumannBoundaryCondition<itk::Image<unsigned short, 2u>, itk::Image<unsigned short, 2u>>>::SetPixel(unsigned int, unsigned short const&, bool&) itkNeighborhoodIterator.hxx:108 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001e760c)
    #2 itk::ErodeObjectMorphologyImageFilter<itk::Image<unsigned short, 2u>, itk::Image<unsigned short, 2u>, itk::BinaryBallStructuringElement<unsigned short, 2u, itk::NeighborhoodAllocator<unsigned short>>>::Evaluate(itk::NeighborhoodIterator<itk::Image<unsigned short, 2u>, itk::ZeroFluxNeumannBoundaryCondition<itk::Image<unsigned short, 2u>, itk::Image<unsigned short, 2u>>>&, itk::BinaryBallStructuringElement<unsigned short, 2u, itk::NeighborhoodAllocator<unsigned short>> const&) itkErodeObjectMorphologyImageFilter.hxx:48 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001dfd84)
    #3 itk::ObjectMorphologyImageFilter<itk::Image<unsigned short, 2u>, itk::Image<unsigned short, 2u>, itk::BinaryBallStructuringElement<unsigned short, 2u, itk::NeighborhoodAllocator<unsigned short>>>::DynamicThreadedGenerateData(itk::ImageRegion<2u> const&) itkObjectMorphologyImageFilter.hxx:159 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001df1c4)
<snip>

  Previous write of size 2 at 0x000107a039c8 by thread T11:
    #0 itk::DefaultPixelAccessor<unsigned short>::Set(unsigned short&, unsigned short const&) const itkDefaultPixelAccessor.h:69 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001300cc)
    #1 itk::DefaultPixelAccessorFunctor<itk::Image<unsigned short, 2u>>::Set(unsigned short&, unsigned short const&) const itkDefaultPixelAccessorFunctor.h:99 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x100130058)
    #2 itk::ImageRegionIterator<itk::Image<unsigned short, 2u>>::Set(unsigned short const&) const itkImageRegionIterator.h:118 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x10012f2d0)
    #3 itk::ObjectMorphologyImageFilter<itk::Image<unsigned short, 2u>, itk::Image<unsigned short, 2u>, itk::BinaryBallStructuringElement<unsigned short, 2u, itk::NeighborhoodAllocator<unsigned short>>>::DynamicThreadedGenerateData(itk::ImageRegion<2u> const&) itkObjectMorphologyImageFilter.hxx:119 (ITKBinaryMathematicalMorphologyTestDriver:arm64+0x1001dee60)
<snip>

We see from the above that one thread is writing to 0x000107a039c8 and another is simultaneously writing to the same address. I didn't check if each thread is writing the same value or not.

Here is the relevant code, NB the fire emojis:

template <typename TInputImage, typename TOutputImage, typename TKernel>
void
ObjectMorphologyImageFilter<TInputImage, TOutputImage, TKernel>::DynamicThreadedGenerateData(
  const OutputImageRegionType & outputRegionForThread)
{
  ImageRegionConstIterator<TInputImage> iRegIter;
  ImageRegionIterator<TOutputImage>     oRegIter;
  iRegIter = ImageRegionConstIterator<InputImageType>(this->GetInput(), outputRegionForThread);
  oRegIter = ImageRegionIterator<OutputImageType>(this->GetOutput(), outputRegionForThread);
  /* Copy the input image to the output image - then only boundary pixels
   * need to be changed in the output image */
  while (!oRegIter.IsAtEnd())
  {
    if (Math::NotExactlyEquals(oRegIter.Get(), m_ObjectValue))
    {
      oRegIter.Set(iRegIter.Get());// πŸ”₯πŸ”₯πŸ”₯πŸ”₯ "Previous write of size 2"
    }
    ++oRegIter;
    ++iRegIter;
  }

  // Find the boundary "faces"
  NeighborhoodAlgorithm::ImageBoundaryFacesCalculator<InputImageType>                        fC;
  typename NeighborhoodAlgorithm::ImageBoundaryFacesCalculator<InputImageType>::FaceListType faceList =
    fC(this->GetInput(), outputRegionForThread, m_Kernel.GetRadius());

  // Setup the kernel that spans the immediate neighbors of the current
  // input pixel - used to determine if that pixel abuts a non-object
  // pixel, i.e., is a boundary pixel
  constexpr auto bKernelSize = MakeFilled<RadiusType>(1);

  TotalProgressReporter progress(this, this->GetOutput()->GetRequestedRegion().GetNumberOfPixels());

  OutputNeighborhoodIteratorType oSNIter;
  InputNeighborhoodIteratorType  iSNIter;
  for (const auto & face : faceList)
  {
    oSNIter = OutputNeighborhoodIteratorType(m_Kernel.GetRadius(), this->GetOutput(), face);
    // No need to overwrite on output...and m_BoundaryCondition is
    // templated over inputImageType - and cannot be applied to the
    // output image
    // oSNIter.OverrideBoundaryCondition(m_BoundaryCondition);
    oSNIter.GoToBegin();

    iSNIter = InputNeighborhoodIteratorType(bKernelSize, this->GetInput(), face);
    iSNIter.OverrideBoundaryCondition(m_BoundaryCondition);
    iSNIter.GoToBegin();

    while (!iSNIter.IsAtEnd())
    {
      if (Math::ExactlyEquals(iSNIter.GetCenterPixel(), m_ObjectValue))
      {
        if (this->IsObjectPixelOnBoundary(iSNIter))
        {
          this->Evaluate(oSNIter, m_Kernel); // πŸ”₯πŸ”₯πŸ”₯πŸ”₯ "Write of size 2", this Evaluate() calls SetPixel()
        }
      }
      ++iSNIter;
      ++oSNIter;
      progress.CompletedPixel();
    }
  }
}
N-Dekker commented 6 days ago

Juts wondering... Could it be that multiple threads share one and the same region as outputRegionForThread? Or at least, could there be any overlap between the output regions of threads?

Otherwise, could it possibly be that input and output image may be one and the same image?

Honestly, I'm clueless 🀷

blowekamp commented 6 days ago

Honestly, I'm clueless 🀷

I see two potential issues. It looks like the neighborhood iterators may go outside the outputRegionForThread.

  1. The first loop copies to the output, but the second loop may use pixel outside of thread region pixels as input.
  2. I am not sure if the "Evaluate" method writes outside of the thread region.

Edited for sanity.

N-Dekker commented 6 days ago

Thanks @blowekamp What do you think about the use of a "face list" in this case? As in: https://github.com/InsightSoftwareConsortium/ITK/blob/18d683ec792bfa7b7a950f781cf9836a5ef64a4d/Modules/Filtering/BinaryMathematicalMorphology/include/itkObjectMorphologyImageFilter.hxx#L140-L142

I think such a face list is meant to allow dealing with the "inner face" (center sub-region) differently than with the boundary faces. Typically, the boundary faces might need a different way to extrapolate or estimate outer pixel values. Right?

However, it looks like ObjectMorphologyImageFilter treats the boundary faces just like the inner face. It just processed all faces, when it does for (const auto & face : faceList). (The first element of faceList is the inner face, the other elements are the boundary faces.) Is that OK?

blowekamp commented 4 days ago

I'll look into this today, after I address a build error I encountered.

N-Dekker commented 4 days ago

However, it looks like ObjectMorphologyImageFilter treats the boundary faces just like the inner face. It just processed all faces, when it does for (const auto & face : faceList). (The first element of faceList is the inner face, the other elements are the boundary faces.) Is that OK?

Elaborating on my question regarding the use of faces: Originally (in 2002), ObjectMorphologyImageFilter did treat the first face (the inner sub-region) different than the other faces (the boundary sub-regions).

https://github.com/InsightSoftwareConsortium/ITK/blob/81b20de6878cb88f1b92d783b474a38174cfd071/Code/BasicFilters/itkObjectMorphologyImageFilter.txx#L138-L144

For the other faces (the boundary sub-regions), a different boundary condition was used:

https://github.com/InsightSoftwareConsortium/ITK/blob/81b20de6878cb88f1b92d783b474a38174cfd071/Code/BasicFilters/itkObjectMorphologyImageFilter.txx#L165-L174

That makes totally sense to me. However, from 43436bf8ab71e250a624f0225e37a6e19c646576 (2023), all faces were treated equally. I think that's a mistake. The center sub-region (the first face) is special, so it deserves special treatment πŸ˜ƒ

dzenanz commented 4 days ago

Feel free to turn that into a PR πŸ˜„

dzenanz commented 4 days ago

After this one is finished πŸ˜„

blowekamp commented 4 days ago

That makes totally sense to me. However, from 43436bf (2023), all faces were treated equally. I think that's a mistake. The center sub-region (the first face) is special, so it deserves special treatment πŸ˜ƒ

The correctness is a separate issue from the race condition.

Perhaps the real issue with identifying what the correct behavior it that the regression tests only print pixel to the std output: https://github.com/InsightSoftwareConsortium/ITK/blob/11ea27f311370ca996a1f29b35bc954bcfe2987e/Modules/Filtering/BinaryMathematicalMorphology/test/itkErodeObjectMorphologyImageFilterTest.cxx#L108-L117

So we really don't know what it should be doing is expected to do.

N-Dekker commented 4 days ago

Indeed @blowekamp, it would have been better if it had some TEST_ASSERT or TEST_EXPECT to specify the expected behavior. Like modern unit tests following the AAA - Arrange, Act, Assert pattern. (Hey, didn't "AAA" mean Almost Always Auto?!? 😸 )

N-Dekker commented 4 days ago

Could the race condition possibly be caused by commenting out the boundary condition of the output iterator?

https://github.com/InsightSoftwareConsortium/ITK/blob/18d683ec792bfa7b7a950f781cf9836a5ef64a4d/Modules/Filtering/BinaryMathematicalMorphology/include/itkObjectMorphologyImageFilter.hxx#L143-L146

By commit c58537485bc04aa91ef1972094969e4ea4c4bd53 (2009). Maybe the boundary condition used to protect the outer pixels of the output region against accidental writes...?

blowekamp commented 4 days ago

I was able to reproduce the defect in the test, and this change solved it. Also not the annotated "write" in the reported issue.

I am not asserting that this fixes, the total algorithm, just was is reported in the issue. The algorithm clearly makes use of output that is copied from other thread's regions.

seanm commented 2 days ago

Test is now passing under TSan today: https://open.cdash.org/viewTest.php?onlydelta&buildid=10051859

Thanks!

N-Dekker commented 2 days ago

@seanm Just wondering, do you use this filter yourself? Because we had some difficulties, trying to understand exactly how it works.

seanm commented 2 days ago

I'm afraid I don't use it. I'm just hoping to have ITK pass all its tests under TSan one day. There are only a few left to fix!