boostorg / compute

A C++ GPU Computing Library for OpenCL
http://boostorg.github.io/compute/
Boost Software License 1.0
1.55k stars 333 forks source link

Memory leak problem #785

Open antibes0415 opened 6 years ago

antibes0415 commented 6 years ago

Hi. I use the most recent develop branch. Here's my problem description: I'm a developer of a software. After adding some codes of boost.compute, the software would crash (program stopped working) after exiting. Actually, all boost.compute objects should be destructed after use so that there shouldn't be crash problem after exiting the whole software. The software uses other 3D librarys like vtk to display 3D mesh. When exiting, it has a memory release mechanism to release GPU memory. So I think the problem may be repetitive release of GPU memory. And there could really still be some memory leak problem. After trying some tests, I found that the the exit crash happens when the following codes are used:

cl::Buffer m_dev_sampleKeys; boost::compute::buffer cbSampleKeys((m_dev_sampleKeys)()); boost::compute::buffer_iterator iter = boost::compute::make_buffer_iterator(cbSampleKeys, 0); boost::compute::sort( iter, iter + m_nbSamples, m_compute_queue );

I have to note that all these boost.compute codes are run in a class object, which will be destructed immediately after use. So in normal case, it should not affect the proccess in software exiting.

So I guess there really may be some problems in buffer or buffer_iterator or sorting. Could you check them?

Ulfgard commented 6 years ago

Could it be that this is #746 which got fixed a few days ago?

antibes0415 commented 6 years ago

Hi. I'm on develop branch and I've checked it's most recent. The problem does not apper when just running my console application. Maybe windows would automatically collect memory leaks and garbages so there's no error in console.

antibes0415 commented 6 years ago

Hi. I suppose it's some static object in boost.compute causing the crash. Static objects release memory in the last, which may cause some problem. In my case, I pass all context, queue with existed OpenCL objects (e.g. cl_context). I found that when I use boost::compute::sort (sort like several million numbers), the software would crash when exiting. Here is my program debug interface when crashing: 2018-06-01 11_18_36-d3dvideo debugging - microsoft visual studio administrator

2018-06-01 11_28_02-d3dvideo debugging - microsoft visual studio administrator

antibes0415 commented 6 years ago

Hi. I add this line in the destructor of my Class and my software stops crashing after exiting. boost::compute::program_cache::get_global_cache(m_compute_context).get()->clear();

in which m_compute_context is set by: cl::Context m_context = cl::Context(CL_DEVICE_TYPE_GPU, ...); boost::compute::context m_compute_context = boost::compute::context(m_context());

I'll keep watching.

Ulfgard commented 6 years ago

I think the issue is the following:

  1. you create the context and tell boost compute about it.
  2. you create kernels, those are cached in the global cache
  3. you destroy your context, which frees all resources
  4. at program exit, the cache is cleared, this leads to a double free/corruption.

//Edit could you try not creating your own context but using the context provided by boost::compute? or is this not possible in your application?

antibes0415 commented 6 years ago

I have to create my own context because I write some complex kernels, and I need to use boost.compute and viennacl. So I need to pass the wrapper of my cl_context, cl_device and cl_mem etc. to boost.compute and viennacl.

Ulfgard commented 6 years ago

this was just to test whether this fixes the problem (so that we know that this is the culprit).

you could query device, context and queue using boost::compute::system instead of providing your own. https://www.boost.org/doc/libs/1_62_0/boost/compute/system.hpp

cl_context& context = boost::compute::system::default_context().get();
cl_device_id& device = boost::compute::system::default_device().get();
command_queue& device = boost::compute::system::default_queue().get();
antibes0415 commented 6 years ago

Thanks. Actually, I use OpenCL 1.2 C++ wrapper. I can pass the cl_context to the cl::Context, but it would automatically destruct the context (release twice) and causes the crash in the end of the program. Don't know why. I suppose it should retain the context (counter + 1). I have commented the line boost::compute::program_cache::get_global_cache(m_compute_context).get()->clear(); in my class destructor. So Maybe it's difficult for me to test the cl_context got from boost::compute::system::default_context().get(). My code looks like this:

  cl::Context m_context;
  std::vector<cl::Device> m_devices;
  cl::CommandQueue m_queue;
  boost::compute::device m_compute_device;
  boost::compute::context m_compute_context;
  boost::compute::command_queue m_compute_queue;

bool PoissonReconGPU::initializeOpenCL() {

  m_compute_context = boost::compute::system::default_context();
  m_compute_device = boost::compute::system::default_device();
  m_compute_queue = boost::compute::system::default_queue();

  m_context = cl::Context(m_compute_context.get());
  m_devices.push_back(cl::Device(m_compute_device.get()));
  m_queue = cl::CommandQueue(m_compute_queue.get());
...
}