KernelTuner / kernel_tuner

Kernel Tuner
https://kerneltuner.github.io/kernel_tuner/
Apache License 2.0

Register observer & correct clock setting #242

Closed fjwillemsen closed 7 months ago

fjwillemsen commented 9 months ago

This pull request adds a built-in Register Observer. The observer works with the PyCUDA, CuPy, and CUDA-Python backends; on unsupported backends, it raises a NotImplementedError. In addition, the pull request makes setting the clocks more efficient and no longer counts the time spent doing so towards the benchmark time.
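To illustrate the behavior described above, here is a minimal, standalone sketch of how such an observer could report the register count and fail on unsupported backends. It does not import Kernel Tuner: the `BenchmarkObserver` stand-in, the `num_regs` attribute, the fake backend classes, and the `observe` helper are all assumptions made for illustration and may differ from the code in this PR.

```python
class BenchmarkObserver:
    """Hypothetical stand-in for an observer base class (not Kernel Tuner's own)."""

    def register_device(self, dev):
        # The tuner would hand the observer a reference to the active backend.
        self.dev = dev

    def get_results(self):
        raise NotImplementedError


class RegisterObserver(BenchmarkObserver):
    """Reports the number of registers per thread used by the compiled kernel."""

    def get_results(self):
        try:
            # Assumption: supported backends expose the count as `num_regs`.
            num_regs = self.dev.num_regs
        except AttributeError:
            backend = type(self.dev).__name__
            raise NotImplementedError(
                f"Backend '{backend}' does not support reading the register count"
            )
        return {"num_regs": num_regs}


class FakeSupportedBackend:
    """Hypothetical backend that knows the register usage of its kernel."""
    num_regs = 32


class FakeUnsupportedBackend:
    """Hypothetical backend without register information."""


def observe(backend):
    """Attach a RegisterObserver to a backend and return its results."""
    observer = RegisterObserver()
    observer.register_device(backend)
    return observer.get_results()
```

In this sketch, `observe(FakeSupportedBackend())` returns a dictionary with a `num_regs` entry, while `observe(FakeUnsupportedBackend())` raises `NotImplementedError`, mirroring the fallback described in the PR text.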

fjwillemsen commented 9 months ago

I happened to stumble upon this PR and, out of curiosity, had a look at the changes and couldn't help but add some comments. My main 'complaint' is that some scope creep seems to have occurred regarding the L2 flushing. In my opinion, it should really be moved into a separate PR.

@csbnw good comments! There is indeed some feature creep in this PR due to ongoing research 😅 The L2 stuff was only in here so Ben could read along, but I'll create a separate PR for it. Converting back to draft for now.

sonarcloud[bot] commented 9 months ago

Quality Gate passed

Issues
6 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

sonarcloud[bot] commented 7 months ago

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud