InsightSoftwareConsortium / ITK

Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
https://itk.org
Apache License 2.0
1.43k stars 666 forks source link

itk(GDCM|DCMTK)SeriesFileNames: memory leaks, bad behavior #2735

Open PHLF opened 3 years ago

PHLF commented 3 years ago

Hello,

I may have not searched thoroughly but I am surprised not to have found an issue about this:

image

But you know... who read the doc anyway? (Please don't take this too seriously ;) )

So out of luck and running out of time, instead of patching ITK's GDCM I tried to use itkDCMTKSeriesFileNames which does not suffer from memory leaks, but... when you are out of luck you should stop trying:

Thank you very much for your time and for your work on this library.

dzenanz commented 3 years ago

I don't think that memory leaks are known. @malaterre @issakomi correct me if I am wrong. @PHLF do you have a simple repro case for a memory leak?

PHLF commented 3 years ago

Something like:

#include <itkGDCMImageIO.h>
#include <itkGDCMSeriesFileNames.h>

#include <filesystem>
#include <vector>

int main()
{
  using SeriesIdContainer = std::vector<std::string>;
  SeriesIdContainer seriesUIDs;

  auto nameGenerator = itk::GDCMSeriesFileNames ::New();

  nameGenerator->SetUseSeriesDetails(true);
  nameGenerator->AddSeriesRestriction("0020|000d");
  nameGenerator->SetGlobalWarningDisplay(false);
  nameGenerator->SetRecursive(false);
  nameGenerator->SetLoadSequences(false);

  namespace fs = std::filesystem;

  fs::path inDcmRootDir = ".";

  // inDcmRootDir is a directory containing DICOM data subfolders
  for (auto const& dirEntry : fs::recursive_directory_iterator{inDcmRootDir})
  {
    if (!dirEntry.is_directory())
    {
      continue;
    }
    // I am pretty sure the leak happen during the following call as GDCMSeriesFileNames call "m_SerieHelper->Clear()"
    nameGenerator->SetInputDirectory(dirEntry.path());

    seriesUIDs = nameGenerator->GetSeriesUIDs();

    for(const auto& seriesUID : seriesUIDs)
    {
      nameGenerator->GetFileNames(seriesUID);
    }
  }

  return EXIT_SUCCESS;
}
PHLF commented 3 years ago

But maybe I did not use the API properly and there is no leak at all.

I forgot to mention: I am using ITK 5.2.0 on a x86_64 debian bullseye system

issakomi commented 3 years ago

I have tried to run the above test with Valgrind on Debian 11, ITK-5.2.1 release. No leaks or other errors were detected. I don't use the classes in my apps, maybe i am missing something. Could you reproduce leaks with Valgrind? And yes, probably it were not bad to review gdcmSeriesHelper.cxx.

PHLF commented 3 years ago

Could you reproduce leaks with Valgrind?

When running my application with massif (valgrind heap analyzer) the allocated memory reaches a limit value for an unknown reason and never goes OOM. But not using valgrind, the same application goes OOM when I use the path of a folder containing several dicoms as a parameter to the itk::GDCMSeriesFileNames::SetInputDirectory: I clearly see an amount of memory being allocated corresponding to the size of my folder's content. And this memory is never freed when I call itk::GDCMSeriesFileNames::SetInputDirectory with a different folder or simpler when I instantiate an itk::GDCMSeriesFileNames owned by a smartpointer in a loop: destroying the itk::GDCMSeriesFileNames does not deallocate the memory corresponding to the DICOM contents.

Looking further in gdcm::SerieHelper used by itk::GDCMSeriesFileNames, on can see that the whole file content of each iterated .dcm file is copied into a FileList's node and these nodes are never destroyed as the call to the delete operator is commented in the gdcm::SerieHelper::Clear function (that's my two cents hypothesis).

I may be missing something as the FileList is making use of SmartPointers and they should free the memory when the FileList is cleared, despite the commented call to "delete", but that is definitely not what I observe.

issakomi commented 3 years ago

I clearly see an amount of memory being allocated corresponding to the size of my folder's content. And this memory is never freed

OK, i have looked at it again. Have tried folder with DICOM sub-folders, 1GB. In fact, the scan seems to stress the memory, it grows and grows, after the scan is done with one sub-folder, it start another one and memory is not freed. Valgrind still doesn't detect a leak, because memory is freed at exit.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.