Closed AntoineFroger closed 3 years ago
I have some code written to implement those macros. Just need to test it before I submit it for review.
The other thing this would require is for the default ctor of movable_domain_thread_range
to emit no range. I think we avoid introducing a whole new range type and avoid making thread_range
movable by using an IILE and introducing a tag type in domain_thread_range::no_range
with a domain_thread_range::domain_thread_range(domain_thread_range::no_range)
ctor that emits no range. Something like this:
#define NVTX3_V1_FUNC_RANGE_IF_IN(C, D) \
::nvtx3::v1::movable_domain_thread_range<D> const nvtx3_range__ = [&](){
if (C) { \
static ::nvtx3::v1::registered_string<D> const nvtx3_func_name__{__func__}; \
static ::nvtx3::v1::event_attributes const nvtx3_func_attr__{nvtx3_func_name__}; \
return ::nvtx3::v1::domain_thread_range<D>{nvtx3_func_attr__}; \
} else{
return ::nvtx3::v1::domain_thread_range<D>{::nvtx3::v1::domain_thread_range::no_range};
}
};
}
Hm, on second thought, my IILE idea won't work because it still requires the object to be movable...
Can we just use start_range/end_range
here instead?
Hm, on second thought, my IILE idea won't work because it still requires the object to be movable...
If we don't touch domain_thread_range
and add the movable_domain_thread_range
this might not be an issue. What do you think?
Can we just use start_range/end_range here instead?
I would rather not for three reasons:
NVTX3_V1_FUNC_RANGE
and NVTX3_V1_FUNC_RANGE_IN
represent a thread range, not a process range. So it would be weird to rely on domain_process_range
for the IF
variantIf we don't touch
domain_thread_range
and add themovable_domain_thread_range
this might not be an issue. What do you think?
Adding a whole additional type for this one use case is kind of a lot of code.
Adding a whole additional type for this one use case is kind of a lot of code.
We can create a base domain_thread_range_base
class which contains most of the shared code (relevant constructors & deleted operators). The domain_thread_range
and movable_domain_thread_range
could inherit from it and only implemented the pieces specific to their class (move ctor/operator, destructor)
I still don't love adding a new range type if we can avoid it. Here's how we can do it just by adding a tag type to pass to domain_thread_range
that indicates no range should be generated: https://godbolt.org/z/v9MYzc
Also, I'm wondering about if nvtx3.hpp
really needs to provide this macro. I'm not doubting if it's useful, I'm just questioning if it really needs to be provided by the library.
There are no doubt numerous variants of this macro some people would find useful, e.g., variants that allow specifying parts of the event_attributes
. The lack of overloading on macros makes this nasty as we then have to have different names for all of them. As a rule, I think nvtx3.hpp
should provide the bare minimum number of convenience macros to avoid bloat and complexity. The question is, do we think NVTX3_FUNC_RANGE_IF
and NVTX3_FUNC_RANGE_IF_IN
are essential enough to be considered bare minimum.
Also the naming of NVTX3_FUNC_RANGE_IF_IN
is potentially confusing. It sounds like the range would only be generated if it's in something.
The NVTX++ interface currently have the
NVTX3_FUNC_RANGE
andNVTX3_FUNC_RANGE_IN
macros which allow the generation of an NVTX range from the lifetime of the block.There are scenarios where we only want to conditionally generate NVTX annotations. For example, some developers might want to annotate their libraries but have some kind of verbosity control. In this case, they might want to control whether an annotation is emitted or not dynamically.
One possible implementation would be to add an additional class similar to
domain_thread_range
which would enable the move constructor. The macro implementation would then be the following:If the user wants the condition to only be evaluated once for the whole duration of the program execution, he can cache the result in a static variable.
The downside of making a class that allows a thread range to be movable is that it can allow misuse of the API. For e.g. a user might create such thread range an move it into a functor which is executed in another thread. If this is problematic, this class could be implemented into the
detail
namespace and documented to warn the users of those invalid cases.