Closed jrhemstad closed 3 years ago
I think it's worth clarifying here that "thread" or "scoped" range objects are trivial RAII wrappers for nvtxRangePush/nvtxRangePop in the NVTX C API, and "process" or "unique" range objects are trivial RAII wrappers for nvtxRangeStart/nvtxRangeEnd.
Note that the choice of "process_range" and "thread_range" was originally based on the behavior of developer tools with hierarchical organization placing the former on a row under a Process node, and the latter under a Thread node. However, the NVTX C API docs don't directly make this distinction that RangeStart/End == process, while they do clarify RangePush/Pop is inherently specific to a CPU thread. Tools no longer make this relationship as stark as they used to -- Nsight Systems detects when both the Start/End calls for a range occur on the same thread, and show it under a Thread node, in addition to ranges that only Start or only End on that thread with visual indication of which end is related.
Ideally there would have just been one kind of range... RangeStart gives you a handle, and you pass that handle to RangeEnd. But since passing things from the beginning of a range to the end can be difficult or inefficient (consider a tool getting a "before" callback and an "after" callback around something), the Push/Pop type was added as an optimization for the case where tools could infer the relationship between the calls. There are measurable perf benefits of using Push/Pop instead of Start/End wherever possible.
The expectation for a modern C++ RAII wrapper around a handle-based kind of range would be that it is movable. So it makes sense to name it something that C++ programmers would recognize as a move-only handle class to an object that needs to be destructed when that one reference to it is dropped. So "nvtx3::unique_range" is a fairly obvious choice, given that it is implemented using std::unique_ptr, and has the same semantics as unique_ptr, unique_lock, unique_resource, etc. Also, while C++20 adds a whole API for "Ranges", these are pairs of iterators in a structure, not really confusable with NVTX ranges of code execution in a program. Even https://en.cppreference.com/w/cpp/algorithm/ranges/unique would be hard to confuse with an nvtx3::unique_range, given that it's an algorithm that rearranges a container to have repeats of values at the end.
The wrapper for Push/Pop ranges doesn't have as clear a choice, but scoped_range is the most familiar to anyone who uses scoped lock objects to manage mutexes. Boost had a scoped_lock for years, and now it's in C++17. Our "scoped" is a bit different in that it prevent usage anywhere but in stack variables, and is not movable. But since it's effectively hard-locked to a lexical scope, which must stay on one CPU thread, that does imply it uses the NVTX functions documented to be specific to a CPU thread. And since lexical scopes nest like a stack, it's easy to connect that to the Push/Pop C API names. The only other name I thought could make sense for these was "stack_range" since it's an even more direct implication, but the familiarity of "scoped_range" being like the tried and true "scoped_lock" gives that one a big advantage.
So I agree with Jake that "scoped_range" is sensible name for the thing you'd normally use, and if you get our polite compiler error message for trying to move it or use it in heap objects, "unique_range" is a sensible name for the thing you'd use instead.
closed by https://github.com/NVIDIA/NVTX/pull/42
Currently in NVTX++ we have:
domain_thread_range<D>
(andthread_range
alias for global domain)operator new
prevents heap objectsdomain_process_range<D>
(process_range
alias for global domain)The naming leaves something to be desired.
thread_range
comes from the concept that the object needs to be created and destroyed on the same thread, but doesn't tell you anything else about it's semantics.process_range
is named primarily to contrast it tothread_range
in that it can be created/destroyed on different threads.We can do better. I propose the following:
domain_thread_range<D>/thread_range -> scoped_range_in<D>/scoped_range
domain_process_range<D>/process_range -> unique_range_in<D>/unique_range
First, the
*_in<D>
better conveys that the range is in the particular domain as opposed to calling it adomain_*<D>
. (credit to @jcohen-nvidia for this idea).Second, the
scoped_range/unique_range
naming is inspired bystd::scoped_lock
andstd::unique_lock
.Like
scoped_range
,scoped_lock
is neither copyable nor movable. It conveys that this object is intrinsically tied to a particular scope.Like
unique_range
,unique_lock
is movable, but not copyable. It represents a single, unique range that is tied to the lifetime of the object and not confined to a particular scope. Furthermore, the canonical name in C++ for "movable but not copyable" things isunique_*
.Furthermore, just as
scoped_lock
is typically preferred tounique_lock
,scoped_range
is preferred tounique_range
.