Closed masahi closed 1 year ago
Generally looks good. Does the profiler work with the stateful API? It would be nice to have a test case using the stateful API over RPC with a tuple input/output to make sure everything we expect will be supported.
A follow-up note: for runtime minimization, we might want an extra option to disable the profiler part. A macro guard is likely sufficient, given that the code is sufficiently isolated.
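As a rough illustration of the macro-guard idea, the sketch below compiles the profiling hook away when a build flag is set to 0. The names `TOY_VM_ENABLE_PROFILER`, `TOY_VM_PROFILE_OP`, `RecordedOps`, and `RunOp` are hypothetical and are not taken from the TVM codebase; the actual guard would wrap whatever profiler code this PR adds.

```cpp
#include <string>
#include <vector>

// Hypothetical build flag (not a real TVM macro): set to 0 to strip profiling.
#ifndef TOY_VM_ENABLE_PROFILER
#define TOY_VM_ENABLE_PROFILER 1
#endif

#if TOY_VM_ENABLE_PROFILER
// Log of op names seen by the profiler hook (illustrative storage only).
inline std::vector<std::string>& RecordedOps() {
  static std::vector<std::string> ops;
  return ops;
}
#define TOY_VM_PROFILE_OP(name) RecordedOps().push_back(name)
#else
// With the guard off, the hook expands to nothing and the storage is not built.
#define TOY_VM_PROFILE_OP(name) ((void)0)
#endif

// Stand-in for executing one op; the hook is the only profiler touch point.
inline int RunOp(const std::string& name, int x) {
  TOY_VM_PROFILE_OP(name);
  return x + 1;
}

// Reports whether the profiler was compiled into this build.
inline bool ProfilerCompiledIn() { return TOY_VM_ENABLE_PROFILER != 0; }
```

Building with `-DTOY_VM_ENABLE_PROFILER=0` would leave `RunOp` free of any profiler code, which is the kind of isolation the comment above relies on.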
Hi @masahi, could you send the VM profiler to the unity branch? It is a left-behind feature; see tracking issue #453.
Sure, I'll do that today.
Adds per-op profiling support to the Relax VM, similar to how the Relay VM is instrumented via the common profiling infra in the runtime.
Example output using DNNL BYOC, showing per-op timing:
Profiling also works over RPC, demonstrated below. This one uses the Relay translator and runs the translated module without `FuseOps`.

In addition to calls to packed funcs, the Relay VM also instruments costly VM-specific ops like `AllocTensor`, `DeviceCopy`, etc. Equivalent instrumentation can be added to the Relax VM if needed.

@YuchenJin @tqchen @tkonolige @csullivan
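To make the idea of instrumenting VM-specific ops concrete, here is a minimal sketch of a dispatch loop that wraps every instruction in a timing scope. It is a toy under stated assumptions, not the Relay or Relax VM implementation: `ToyProfiler`, `Opcode`, and `RunProgram` are hypothetical names, and only the op names echo the ones mentioned above.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical opcode set; only the names mirror the ops discussed above.
enum class Opcode { kCallPacked, kAllocTensor, kDeviceCopy };

inline const char* OpcodeName(Opcode op) {
  switch (op) {
    case Opcode::kCallPacked: return "CallPacked";
    case Opcode::kAllocTensor: return "AllocTensor";
    case Opcode::kDeviceCopy: return "DeviceCopy";
  }
  return "Unknown";
}

// Collects call counts and total elapsed microseconds per instruction kind.
class ToyProfiler {
 public:
  struct Entry {
    int calls = 0;
    double total_us = 0.0;
  };

  // RAII scope: starts timing on construction, records on destruction.
  class Scope {
   public:
    Scope(ToyProfiler* prof, std::string name)
        : prof_(prof), name_(std::move(name)),
          start_(std::chrono::steady_clock::now()) {}
    ~Scope() {
      auto end = std::chrono::steady_clock::now();
      Entry& e = prof_->entries_[name_];
      e.calls += 1;
      e.total_us +=
          std::chrono::duration<double, std::micro>(end - start_).count();
    }

   private:
    ToyProfiler* prof_;
    std::string name_;
    std::chrono::steady_clock::time_point start_;
  };

  const std::map<std::string, Entry>& entries() const { return entries_; }

 private:
  std::map<std::string, Entry> entries_;
};

// Toy dispatch loop: each instruction runs inside its own profiling scope.
inline void RunProgram(const std::vector<Opcode>& program, ToyProfiler* prof) {
  for (Opcode op : program) {
    ToyProfiler::Scope scope(prof, OpcodeName(op));
    // The real work for the instruction would happen here.
  }
}
```

Because the scope records on destruction, an instruction is counted even if its handler returns early, which keeps the instrumentation to one line per dispatch site.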