NVIDIA / NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.
Apache License 2.0
309 stars 48 forks source link

Python String Registration support #75

Closed Sipondo closed 1 year ago

Sipondo commented 1 year ago

Added support for string registration which is required for performance-critical tracking and NVTX annotation filtering in Nsight Systems/Compute.

Example:

import time
import nvtx
registration = nvtx.register_string("Range A", domain="my_domain")
range_id = nvtx.start_range(registration)
time.sleep(1)
nvtx.end_range(range_id)
shwina commented 1 year ago

Thanks @Sipondo - this is looking mostly good to me!

Just curious, do you think we could go ahead and register all strings passed to annotate/push/start transparently on the user's behalf? Of course, this will lead to an increase in memory usage but I wonder if in typical applications, that memory use is small or bounded enough to be OK. Doing this would allow us to keep the NVTX Python API smaller/simpler.

(Feel free to let me know if you think this is a bad idea!)

Sipondo commented 1 year ago

Thank you for your reply, @shwina .

Makes sense to me! As long as the documentation makes this explicit.

From what I've seen these Python bindings are not used in situations where performance is so critical that such an increase in overhead would be problematic - in that case, one should probably refer to a C implementation instead. The degree of string registration will likely matter little to none for the applications this API was designed for.

That said, string registration is required for NVTX filtering in Nsys and NCU. Automatically registering all strings will definitely remove some confusion and development time when people want to filter using NVTX.

shwina commented 1 year ago

Thanks! -- that's good to know that you have the same impression. So can we go ahead and make the following changes in this PR?

  1. Go back to accepting only str objects for the message= argument in annotate/push/start_range, and create RegisteredString objects behind the scenes.
  2. Add a note in the docstrings that all messages are cached, so creating a very large number of annotations/ranges can lead to increase memory usage.

If (2) ever becomes an issue, we can perhaps look into a global config option that controls whether or not to cache message strings.

Sipondo commented 1 year ago

Those changes sound good.

How would you suggest to cache the RegisteredString objects? Should we keep a private dictionary in nvtx.py?

shwina commented 1 year ago

How would you suggest to cache the RegisteredString objects?

Because RegisteredString has the metaclass CachedInstanceMeta, that should already be taken care of for you:

In [7]: s1 = RegisteredString(domain, "hello")

In [8]: s2 = RegisteredString(domain, "hello")

In [9]: s3 = RegisteredString(domain, "goodbye")

In [10]: s1 is s2
Out[10]: True

In [11]: s1 is s3
Out[11]: False

In [12]: domain2 = nvtx._lib.Domain("domain2")

In [13]: s4 = RegisteredString(domain2, "hello")

In [14]: s1 is s4
Out[14]: False
Sipondo commented 1 year ago

Ah, right! I was familiar with the behaviour of the current cached strings hadn't checked source. Thanks for confirming.

Sipondo commented 1 year ago

@shwina I've committed the changes as we discussed. This commit also includes compatibility with the profiler side of things and I fixed a small unrelated issue with categories.

shwina commented 1 year ago

Just a couple of comments, after which this should be good to go! I'll test locally, merge, and cut a new release of NVTX with this feature!

shwina commented 1 year ago

Gentle ping @Sipondo -- do you think you can take this over the finish line? If you're pressed for time I can do so as well and merge it!

Sipondo commented 1 year ago

Thanks for the help! Looking forward to seeing the new library release.

shwina commented 1 year ago

Thanks! Just uploaded NVTX 0.2.7 to PyPI which should have this feature

Sipondo commented 1 year ago

Comment moved to its own issue: https://github.com/NVIDIA/NVTX/issues/78