Closed by fjwillemsen 7 months ago
I happened to stumble upon this PR, had a look at the changes out of curiosity, and couldn't help adding some comments. My main 'complaint' is that some scope creep seems to have occurred regarding the L2 flushing. In my opinion, it should really be moved into a separate PR.
@csbnw good comments! There is indeed some feature creep in this PR due to ongoing research 😅 the L2 stuff was just in here so Ben could read along, but I'll indeed create a separate PR for it. Converting back to draft for now.
Issues: 6 new, 0 accepted
Measures: 0 security hotspots, no coverage data, 0.0% duplication on new code
Issues: 3 new, 0 accepted
Measures: 0 security hotspots, no coverage data, 0.0% duplication on new code
This pull request adds a built-in Register Observer. The observer works with the PyCUDA, CuPy, and CUDA-Python backends; on unsupported backends, it raises a `NotImplementedError`. In addition, the pull request sets clocks more efficiently and no longer counts the time spent setting them towards the benchmark time.
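To illustrate the behaviour on supported versus unsupported backends, here is a minimal, self-contained sketch. It is not the code from this PR: the class names, the `num_regs` attribute, and the value `32` are all assumptions made for illustration, and the two backend classes are plain stand-ins that do not touch a GPU.

```python
class RegisterObserver:
    """Hypothetical observer that reports the registers per thread of a kernel."""

    def __init__(self, backend):
        self.backend = backend

    def get_results(self):
        # Assumption: a supporting backend exposes the count as `num_regs`.
        num_regs = getattr(self.backend, "num_regs", None)
        if num_regs is None:
            raise NotImplementedError(
                f"backend {type(self.backend).__name__} does not report register usage"
            )
        return {"num_regs": num_regs}


class SupportedBackend:
    """Stand-in for a backend that can report register usage."""

    num_regs = 32  # made-up value for illustration


class UnsupportedBackend:
    """Stand-in for a backend without register reporting."""


def describe(backend):
    """Return the observer's results, or the error message if unsupported."""
    try:
        return RegisterObserver(backend).get_results()
    except NotImplementedError as err:
        return str(err)


print(describe(SupportedBackend()))
print(describe(UnsupportedBackend()))
```

Raising `NotImplementedError` instead of returning a placeholder value makes the lack of support visible to the caller immediately, so a missing register count cannot be mistaken for a real measurement.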