Open inducer opened 7 years ago
I tried the quick hack below, but I am getting INVALID_COMMAND_QUEUE later on. There is no error upon calling clCreateCommandQueue, though. Setting an invalid QUEUE_SIZE does result in INVALID_QUEUE_VALUE, however: for example, on AMD, creating a queue of 16 MB when 8 MB is the maximum.
From stackoverflow https://stackoverflow.com/questions/45767759/how-to-set-device-side-queue-size-in-pyopencl/49957843#49957843
Sorry, I don't have the spare cycles at this moment to investigate in detail. I've put this on my list for later in the summer.
Will definitely check this out soon - thanks for adding support for this.
Kicking this slightly back to life: did you ever manage to get a device-side queue working through PyOpenCL? I have spent the better part of today trying to make this work, but the closest I've gotten is the queue being created and a "clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE" being thrown at me when I try to enqueue a trivial kernel (that does nothing).
What ICD (OpenCL driver) are you using?
Edit: PEBCAK
The ranting below is because I didn't understand that you can't enqueue to a device-side queue from the host side. You need two queues: one on the host, one on the device. You can mark the device queue as the default.
I've tried both the Nvidia (OpenCL 1.2) and Intel (OpenCL 2.1) runtimes. The method complains about incompatibility when I use Nvidia, of course.
In both cases I create the queue like this:
cl.CommandQueue(self._cl_context, properties=cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE)
and...
cl.CommandQueue(self._cl_context, properties=[cmq.PROPERTIES, cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE, cmq.SIZE, 1024])
leads to
pyopencl._cl.LogicError: clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE
Actually, I lie: on Nvidia this leads to a segfault, though I have read that the ...withProperties() function is supported now.
If I remove this and simply make an in-order, on-host queue (the default), the kernel runs fine...
Thanks for following up! Just to be clear: did you get things to work on Intel? (I'd expect that to work sooner than I'd expect the same of Nvidia.)
Eh!
It's complicated. I am definitely able to create on-device queues on both the Intel and Nvidia platforms. I have made the following observations:
Using the ...withProperties() call is required for doing this on Nvidia. On Intel I can use both calls and it works, but only on certain cards: my desktop has a 1660 and it doesn't work (OUT_OF_RESOURCES error), while the same code works on a Tesla V100. I also have an AMD card that throws "out of host memory" when I try to make the second queue using the withProperties() function, but I am able to use the 'normal' clCreateCommandQueue().
I can enqueue_kernel() on both Intel and Nvidia. BUT, on both platforms I get hangs if I do not turn off code caching. No idea why.
Thanks for reporting back! Could you share some example code? I'd like to include that in the tests, if for no other reason than to make sure that the things that are working stay working.
This would be the pattern to follow: https://github.com/inducer/pyopencl/blob/21b09e316b00765d9c1612d4ad6b078003939049/src/c_wrapper/command_queue.cpp#L42-L66