How to use shared memory in the inject device function? #41

NVlabs / NVBit

Closed: zz-Malfurion closed this issue 3 years ago

zz-Malfurion commented 3 years ago

I want to use shared memory in the injected device function, but the code does not compile.

Device Code:

extern "C" __device__ __noinline__ void inject_kernel() {
    extern __shared__ int x[];
    printf("%d\n", x[0]);
}

Compiler Error:

ptxas error   : Allocating additional shared memory is not allowed when command line option '--compile-as-tools-patch' is specified

Is there any way to use shared memory inside the inject function?

zz-Malfurion commented 3 years ago

Ohh, I just found this limitation in the paper.

Injected functions may not use shared and constant memory because that memory can be used by the application itself.

But in my inject_kernel, I only want to read the shared memory of the original kernel, not modify it. Is that also not possible?

ovilla commented 3 years ago

We currently don't have an interface for reading shared memory within an injected function. We will keep it in mind for the next version, but unfortunately we will not be able to explore this in the short term, as it requires a non-trivial amount of plumbing.

zz-Malfurion commented 3 years ago

Can I pass the constant-bank values c[0x0][0x18] and c[0x0][0x1c] as the base address of the shared memory? (BTW, I'm using a GV100.)

It works for me in this simple test:

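extern "C" __device__ __noinline__ void inject_kernel(unsigned int r2, unsigned int r3) {
    // Rebuild the 64-bit address: r2 is the low word, r3 the high word.
    unsigned long x = ((unsigned long)r3 << 32) | r2;
    // Treat it as a pointer and print the first 64-bit value stored there.
    long *shared_mem_ptr = (long *)x;
    printf("%ld\n", *shared_mem_ptr);
}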
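For anyone checking the bit layout used in the test above, here is a minimal host-side C++ sketch of the same low-word/high-word combination. The helper name `combine_words` and the example words are made up for illustration and are not part of NVBit; the sketch only shows the arithmetic, not how the two words reach the injected function.

```cpp
#include <cstdint>

// Hypothetical helper (not an NVBit API): rebuild a 64-bit address from two
// 32-bit words, low word first, mirroring the arithmetic in inject_kernel.
constexpr std::uint64_t combine_words(std::uint32_t lo, std::uint32_t hi) {
    // Widen the high word before shifting: shifting a 32-bit value by 32
    // bits is undefined behavior in C++.
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Made-up example words, chosen only to make the resulting layout visible.
static_assert(combine_words(0x89abcdefu, 0x01234567u) == 0x0123456789abcdefULL,
              "high word lands in the upper 32 bits");
```

Using a fixed-width type such as `std::uint64_t` instead of `unsigned long` keeps the result 64 bits wide even on platforms where `unsigned long` is only 32 bits.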

ovilla commented 3 years ago

We never tried that explicitly, but it seems promising.

From the earlier post I thought you were looking for an API to dump the entire contents of shmem (without knowing a particular address), but if you know the exact address, I guess you can do it the way you are doing above. It should be safe.

zz-Malfurion commented 3 years ago

Thanks for your explanation.