-
For CUDA profiling, and potentially for debugging, it is important to be able to correlate high-level Numba Python code with low-level assembly. This is achieved by providing lineinfo during compi…
-
The proposal here is to extend and structure #195 as a module rather than _just_ a method on the GP. It is still nice to have quick access to this, but structuring it this way would:
(1) facilitate enhanc…
-
Hi there,
This request is a follow-up to the discussion here: https://github.com/google/jax/discussions/18065.
### What we're trying to accomplish
Suppose we have our own JIT compiler, called…
-
I hope this message finds you well. I am currently reviewing your open-source code related to [S-Rocket: Selective Random Convolution Kernels for Time Series Classification], and I have a question r…
-
When tracing kernel functions, I see too many unrelated kernel functions in the trace. This might be due to security policies or some other reason, but it makes the tracing output too complica…
-
**Is your feature request related to a problem? Please describe**
In CUDA, a kernel functor can be copied to a CUDA symbol, which can be marked \_\_const\_\_. This enables the compiler to perform v…
-
I have installed the JupyterLab debugger on my Windows 10 machine.
I followed the explanation at: https://blog.jupyter.org/a-visual-debugger-for-jupyter-914e61716559
It was working ok. Now I'm executing …
-
Here is a simple hello world program.
```c
$ cat hello.c
#include <stdio.h>
int main()
{
    printf("Hello\n");
    fflush(stdout);
    return 0;
}
```
If we try kernel tracing, the ou…
-
Hello, in some of my kernel functions, the APIs I call trigger asynchronous operations in the background, so I need to check the results of those operations from another source. That's why I shouldn’t …
-
CUDA 6.5 introduced several useful functions for computing optimal occupancy. See https://developer.nvidia.com/blog/cuda-pro-tip-occupancy-api-simplifies-launch-configuration/ for details. But as a fi…