Closed undertherain closed 2 years ago
Tested. See output below!
If a newer Python is installed, it generates:
-----------------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------
Name                          Self CPU %       Self CPU    CPU total %      CPU total   CPU time avg      Self CUDA    Self CUDA %     CUDA total  CUDA time avg     # of Calls
-----------------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------  -------------
aten::conv2d                     0.04%        1.088ms         98.57%         2.497s        5.890ms        0.000us          0.00%      142.900ms      337.028us            424
aten::convolution                0.05%        1.222ms         98.52%         2.496s        5.888ms        0.000us          0.00%      142.900ms      337.028us            424
aten::_convolution               0.08%        2.051ms         98.48%         2.495s        5.885ms        0.000us          0.00%      142.900ms      337.028us            424
aten::cudnn_convolution          1.83%       46.312ms         98.40%         2.493s        5.880ms      142.900ms         87.49%      142.900ms      337.028us            424
cudaMemsetAsync                 87.10%         2.207s         87.10%         2.207s       17.798ms        0.000us          0.00%        0.000us        0.000us            124
cudaEventSynchronize             3.96%      100.362ms          3.96%      100.362ms      737.956us        0.000us          0.00%        0.000us        0.000us            136
cudaMalloc                       3.75%       95.117ms          3.75%       95.117ms        3.171ms        0.000us          0.00%        0.000us        0.000us             30
On F (or systems with older PyTorch):
$ head logs/inference/resnet50/unknown_CPU/pytorch_21.09.17_15.45.50.profile
{
    "ClassifierInference": {
        "net": {
            "conv1": {
                "null": {
                    "self_cpu_total": 45617.25099999998,
                    "cpu_total": 273811.58600000007,
                    "cuda_total": 0,
                    "occurrences": 1,
                    "param": "Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)"
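For anyone who wants to rank layers from a .profile file like the one excerpted above, here is a rough sketch using only the standard library. It assumes the file is plain JSON and that every per-layer entry is a dict carrying a "self_cpu_total" field, as in the excerpt; the rest of the file is not shown here, so the real layout may differ. The helper names (iter_layer_stats, slowest_layers) and the "fc" entry in the sample are made up for illustration and are not part of the benchmark code.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Field that marks a per-layer stats dict in the excerpt above (assumption:
# every timed entry carries it; other keys are treated as nesting levels).
STAT_KEY = "self_cpu_total"


def iter_layer_stats(
    node: Any, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (dotted_path, stats) for each nested dict that has STAT_KEY."""
    if not isinstance(node, dict):
        return
    if STAT_KEY in node:
        yield ".".join(path), node
        return  # assumption: stats dicts have no timed children
    for key, child in node.items():
        yield from iter_layer_stats(child, path + (key,))


def slowest_layers(profile: Dict[str, Any], top: int = 5) -> List[Tuple[str, Dict[str, Any]]]:
    """Return up to `top` entries sorted by self CPU time, largest first."""
    rows = list(iter_layer_stats(profile))
    rows.sort(key=lambda item: item[1][STAT_KEY], reverse=True)
    return rows[:top]


# Sample data: the conv1 entry is copied from the excerpt; the "fc" entry
# is hypothetical and only there to give the sort something to do.
sample = {
    "ClassifierInference": {
        "net": {
            "conv1": {
                "null": {
                    "self_cpu_total": 45617.25099999998,
                    "cpu_total": 273811.58600000007,
                    "cuda_total": 0,
                    "occurrences": 1,
                }
            },
            "fc": {"null": {"self_cpu_total": 100.0, "cpu_total": 100.0}},
        }
    }
}

for name, stats in slowest_layers(sample):
    print(name, stats[STAT_KEY])
```

With a real file, loading it via json.load(open(path)) and passing the result to slowest_layers should work as long as the layout assumption holds.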
use torch's internal profiler