Open TApplencourt opened 2 years ago
Hi @TApplencourt, thanks for your feedback! Please give me some time to dive deeply into this. First of all, I'd like to understand how general this problem is. But even right now I agree that the current kernel tracing for Level Zero looks over-complicated, and we probably need to simplify things somehow.
Hi @TApplencourt,
The problem you've reported is known to us, and indeed we don't support such a case in our tools for now. The reason is that we are not aware of any customer application that uses such an approach. Do you face this case in real life, or is it just a reproducer?
Of course, the lack of support right now doesn't mean we don't plan to add it. To deal with such a case, the zeCommandListAppendQueryKernelTimestamps function should be used. Note also that having callbacks in Level Zero does not resolve this issue by itself, since it's more about the current Level Zero design. But yes, it can make customers' lives easier by moving all the problems inside Level Zero rather than having them outside.
Currently we are thinking about an approach similar to CUPTI Activity, where one can subscribe to some event (e.g. kernel invocation) and be notified asynchronously (with a callback) when this event happens. Do you believe this is something you would prefer to use?
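To make the subscription idea concrete, here is a minimal sketch of what an activity-style interface could look like from the tool's side. Everything in it (ActivityCollector, KernelRecord, Subscribe, Record, Flush) is a hypothetical name invented for illustration; it is not an existing Level Zero, PTI, or CUPTI API. The only point is the shape: records are buffered on the runtime side and delivered to the subscriber later, in batches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record describing one completed kernel.
struct KernelRecord {
    std::string name;
    std::uint64_t start_ns;
    std::uint64_t end_ns;
};

// Hypothetical collector: the runtime would buffer records internally
// and hand them to the subscriber asynchronously.
class ActivityCollector {
public:
    using Callback = std::function<void(const std::vector<KernelRecord>&)>;

    // Registers the tool's callback; an empty callback is a usage error.
    void Subscribe(Callback cb) {
        assert(cb != nullptr);
        callback_ = std::move(cb);
    }

    // Would be called by the runtime when a kernel completes.
    void Record(KernelRecord r) { pending_.push_back(std::move(r)); }

    // Delivers all buffered records in one batch and returns how many
    // were delivered. Without a subscriber the records stay buffered.
    std::size_t Flush() {
        if (!callback_ || pending_.empty()) return 0;
        std::vector<KernelRecord> batch;
        batch.swap(pending_);
        callback_(batch);
        return batch.size();
    }

private:
    Callback callback_;
    std::vector<KernelRecord> pending_;
};
```

With such a design the tool never touches user events, so an event reset by the application would no longer be the tool's problem.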
> Do you face this case in real life, or is it just a reproducer?
Just a reproducer (for now :D)
> To deal with such a case zeCommandListAppendQueryKernelTimestamps function should be used.
Oh yes, this requires a little infrastructure (allocating device memory, handling offsets, ...) but it's totally feasible indeed!
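As a rough idea of the offset handling, here is a minimal host-side sketch. It assumes one result slot per kernel in a single destination buffer, with the byte offsets passed through the pOffsets argument of zeCommandListAppendQueryKernelTimestamps. TimestampSlots and ReadSlot are hypothetical helpers, and TimestampResult is only a stand-in that mirrors the layout of ze_kernel_timestamp_result_t, so that the sketch compiles without the Level Zero headers. The device allocation and the append call itself are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins mirroring ze_kernel_timestamp_data_t / ze_kernel_timestamp_result_t.
struct TimestampData {
    std::uint64_t kernelStart;
    std::uint64_t kernelEnd;
};
struct TimestampResult {
    TimestampData global;
    TimestampData context;
};

// Hypothetical helper: hands out one result-sized slot per traced kernel.
class TimestampSlots {
public:
    explicit TimestampSlots(std::size_t capacity) : capacity_(capacity) {}

    // Reserves the next slot and returns its byte offset in the buffer.
    std::size_t Reserve() {
        assert(offsets_.size() < capacity_);
        offsets_.push_back(offsets_.size() * sizeof(TimestampResult));
        return offsets_.back();
    }

    // Size of the destination buffer the tool would have to allocate.
    std::size_t BytesNeeded() const { return capacity_ * sizeof(TimestampResult); }

    // Offsets reserved so far, in reservation order.
    const std::vector<std::size_t>& Offsets() const { return offsets_; }

private:
    std::size_t capacity_;
    std::vector<std::size_t> offsets_;
};

// Reads one result back from a host copy of the destination buffer.
TimestampResult ReadSlot(const std::vector<std::uint8_t>& buffer, std::size_t offset) {
    assert(offset + sizeof(TimestampResult) <= buffer.size());
    TimestampResult r;
    std::memcpy(&r, buffer.data() + offset, sizeof(r));
    return r;
}
```

The tool would keep the offset next to its own per-kernel record, so the user is free to reset or reuse the event once the query has been appended.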
> Currently we are thinking about an approach similar to CUPTI Activity, where one can subscribe to some event (e.g. kernel invocation) and be notified asynchronously (with a callback) when this event happens. Do you believe this is something you would prefer to use?
Something along those lines sounds good! But I'm not by any means an expert, so I will let @Kerilk write a more insightful reply.
Hi Anton,
@Kerilk and I are also developing an L0 tracer (https://github.com/argonne-lcf/THAPI). Recently we found that we don't handle the use case where a user resets an event with zeCommandListAppendEventReset. It looks like your zetracer has the same limitation (see the reproducer below).
In our tool, supporting such a use case will be expensive with the current L0 spec. We asked many times for L0 to add native callbacks (also on event change). This should greatly reduce the implementation complexity and overhead of tracing.
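The limitation can be illustrated without Level Zero at all. The toy model below is only a sketch: FakeEvent, Command, TraceLazily and TraceWithSnapshots are hypothetical names, and the command sequence (kernel, reset, kernel on the same event) is an assumption about what the reproducer does. A tracer that reads the event only after the whole command list has run sees whatever the event holds at the end, while a tracer that snapshots the timing before the reset keeps one value per kernel.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for an event that carries a kernel timing once signaled.
struct FakeEvent {
    std::optional<std::uint64_t> duration_ns;  // empty means "not signaled"
    void Signal(std::uint64_t d) { duration_ns = d; }
    void Reset() { duration_ns.reset(); }
};

// One entry of a toy command list: a kernel launch or an event reset.
struct Command {
    enum class Kind { Kernel, Reset };
    Kind kind;
    std::string name;           // kernel name, unused for Reset
    std::uint64_t duration_ns;  // kernel duration, unused for Reset
};

using Timings = std::map<std::string, std::optional<std::uint64_t>>;

// Lazy tracer: remembers which kernels used the event and reads the
// event only after the whole list has executed.
Timings TraceLazily(const std::vector<Command>& list) {
    FakeEvent event;
    std::vector<std::string> kernels;
    for (const auto& c : list) {
        if (c.kind == Command::Kind::Kernel) {
            event.Signal(c.duration_ns);
            kernels.push_back(c.name);
        } else {
            event.Reset();
        }
    }
    Timings result;
    for (const auto& k : kernels) result[k] = event.duration_ns;
    return result;
}

// Snapshot tracer: copies the timing right after each kernel, before any
// later reset can destroy it.
Timings TraceWithSnapshots(const std::vector<Command>& list) {
    FakeEvent event;
    Timings result;
    for (const auto& c : list) {
        if (c.kind == Command::Kind::Kernel) {
            event.Signal(c.duration_ns);
            result[c.name] = event.duration_ns;
        } else {
            event.Reset();
        }
    }
    return result;
}
```

In this model, a list of "k1, reset" leaves the lazy tracer with no timing for k1, and "k1, reset, k2" makes it report k2's timing for both kernels.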
So far, our feedback hasn't gotten a lot of traction. Maybe if two independent teams implementing tracing in two different code bases both need callbacks, L0 will be more inclined to add them...
So the question is, do you think having callbacks will help onetrace?
Reproducer
ze.cpp
kernel.cl
Compile
What should we expect?
We should expect k1 to show the kernel execution. But we don't see it.
And if we run k1 and k2, we have timing for each kernel, but they correspond only to k2.
Hope this helps. Don't hesitate if you have any feedback.