Open bmaz-dev opened 1 month ago
Profiler results for CPU:
Name Self CPU % Self CPU CPU total % CPU total CPU time avg CPU Mem Self CPU Mem # of Calls
aten::empty 0.09% 23.884ms 0.09% 23.884ms 2.990us 11.58 Gb 11.58 Gb 7987
aten::mul 0.90% 239.991ms 0.91% 241.627ms 278.051us 1.82 Gb 1.82 Gb 869
aten::leaky_relu 0.45% 118.991ms 0.45% 118.991ms 1.545ms 1.79 Gb 1.79 Gb 77
aten::sigmoid 1.14% 302.637ms 1.14% 302.637ms 3.646ms 1.73 Gb 1.73 Gb 83
aten::add 0.59% 156.176ms 0.59% 157.522ms 261.664us 1.59 Gb 1.59 Gb 602
aten::elu 0.86% 229.634ms 0.86% 229.634ms 4.416ms 1.47 Gb 1.47 Gb 52
aten::cat 0.95% 251.466ms 0.99% 263.679ms 1.188ms 1.41 Gb 1.41 Gb 222
aten::resize_ 0.01% 2.036ms 0.01% 2.036ms 3.486us 496.96 Mb 496.96 Mb 584
aten::empty_strided 0.02% 4.963ms 0.02% 4.963ms 1.736us 322.03 Mb 322.03 Mb 2859
aten::div 0.03% 9.015ms 0.05% 13.573ms 10.353us 104.40 Mb 104.40 Mb 1311
aten::clamp_min 0.02% 4.391ms 0.02% 4.391ms 337.770us 42.88 Mb 42.88 Mb 13
aten::tanh 0.01% 3.825ms 0.01% 3.825ms 78.057us 34.22 Mb 34.22 Mb 49
aten::pow 0.01% 3.328ms 0.01% 3.434ms 10.131us 22.71 Mb 22.71 Mb 339
aten::replication_pad1d 0.01% 2.349ms 0.01% 2.349ms 587.299us 15.45 Mb 15.45 Mb 4
aten::sqrt 0.02% 4.387ms 0.02% 4.387ms 626.689us 12.90 Mb 12.90 Mb 7
aten::sub 0.00% 1.258ms 0.01% 1.666ms 32.667us 7.65 Mb 7.64 Mb 51
aten::reflection_pad1d 0.01% 1.919ms 0.01% 1.919ms 213.236us 6.73 Mb 6.73 Mb 9
aten::mean 0.00% 164.064us 0.03% 7.073ms 1.010ms 6.23 Mb 6.23 Mb 7
aten::exp 0.00% 766.632us 0.00% 766.632us 30.665us 5.98 Mb 5.98 Mb 25
aten::_softmax 0.00% 1.072ms 0.00% 1.072ms 536.178us 5.49 Mb 5.49 Mb 2
aten::sum 0.06% 16.173ms 0.07% 17.401ms 51.180us 5.43 Mb 5.43 Mb 340
aten::abs 0.00% 407.972us 0.00% 783.578us 97.947us 7.70 Mb 3.85 Mb 8
aten::atan2 0.06% 16.940ms 0.06% 16.940ms 8.470ms 3.74 Mb 3.74 Mb 2
aten::complex 0.00% 603.847us 0.00% 1.201ms 600.298us 7.47 Mb 3.74 Mb 2
aten::neg 0.00% 174.937us 0.00% 174.937us 34.987us 3.26 Mb 3.26 Mb 5
aten::cos 0.00% 1.099ms 0.00% 1.099ms 3.371us 2.49 Mb 2.49 Mb 326
aten::sin 0.00% 1.032ms 0.00% 1.032ms 3.165us 2.49 Mb 2.49 Mb 326
aten::eq 0.01% 2.302ms 0.01% 3.659ms 11.189us 1.09 Mb 1.09 Mb 327
aten::mm 0.00% 369.797us 0.00% 371.476us 185.738us 960.67 Kb 960.67 Kb 2
aten::gt 0.00% 338.914us 0.00% 388.025us 55.432us 702.98 Kb 702.95 Kb 7
aten::where 0.00% 1.311ms 0.01% 1.440ms 4.432us 640.60 Kb 640.60 Kb 325
aten::addmm 0.01% 1.707ms 0.01% 1.919ms 54.821us 214.00 Kb 214.00 Kb 35
aten::norm 0.01% 1.797ms 0.04% 11.568ms 55.886us 209.63 Kb 209.63 Kb 207
aten::arange 0.01% 1.414ms 0.01% 2.901ms 90.652us 16.11 Mb 144.62 Kb 32
aten::index_select 0.00% 30.440us 0.00% 42.725us 42.725us 8.00 Kb 8.00 Kb 1
aten::argmax 0.00% 351.202us 0.00% 354.223us 354.223us 1.87 Kb 1.87 Kb 1
aten::lt 0.00% 163.019us 0.00% 299.972us 42.853us 953 b 929 b 7
aten::_to_copy 0.02% 4.209ms 0.04% 10.485ms 4.345us 4.68 Mb 28 b 2413
aten::ceil 0.00% 48.521us 0.00% 48.521us 12.130us 16 b 16 b 4
aten::floor 0.00% 19.725us 0.00% 19.725us 4.931us 16 b 16 b 4
aten::detach 0.01% 2.021ms 0.02% 4.078ms 3.209us 0 b 0 b 1271
detach 0.01% 2.057ms 0.01% 2.057ms 1.631us 0 b 0 b 1261
aten::uniform_ 1.78% 473.168ms 1.78% 473.168ms 590.721us 0 b 0 b 801
aten::zeros 0.00% 153.719us 0.00% 593.052us 16.028us 4.05 Mb 0 b 37
aten::zero_ 0.00% 198.350us 0.00% 791.140us 5.651us 0 b 0 b 140
aten::ones 0.00% 46.252us 0.00% 87.798us 5.165us 6.02 Kb 0 b 17
aten::fill_ 0.42% 110.256ms 0.42% 110.301ms 237.207us 0 b 0 b 465
aten::to 0.01% 2.600ms 0.05% 13.086ms 2.989us 4.68 Mb 0 b 4378
aten::lift_fresh 0.00% 103.465us 0.00% 103.465us 0.282us 0 b 0 b 367
aten::detach_ 0.00% 417.550us 0.00% 538.588us 1.570us 0 b 0 b 343
detach_ 0.00% 121.038us 0.00% 121.038us 0.353us 0 b 0 b 343
aten::_has_compatible_shallow_copy_type 0.00% 171.775us 0.00% 171.775us 0.081us 0 b 0 b 2114
aten::copy_ 1.50% 397.472ms 1.50% 397.472ms 102.204us 0 b 0 b 3889
aten::cos_ 0.00% 471.539us 0.00% 471.539us 58.942us 0 b 0 b 8
aten::narrow 0.02% 6.120ms 0.05% 13.989ms 3.130us 0 b 0 b 4470
aten::slice 0.05% 13.627ms 0.06% 17.070ms 2.490us 0 b 0 b 6854
aten::as_strided 0.03% 7.582ms 0.03% 7.582ms 0.626us 0 b 0 b 12115
aten::set_ 0.00% 1.291ms 0.00% 1.291ms 1.177us 0 b 0 b 1096
aten::unsqueeze 0.01% 1.442ms 0.01% 2.186ms 3.842us 0 b 0 b 569
aten::squeeze 0.01% 1.477ms 0.01% 2.258ms 7.922us 0 b 0 b 285
aten::pad 0.00% 376.377us 0.83% 220.189ms 3.238ms 1.38 Gb 0 b 68
aten::constant_pad_nd 0.01% 1.512ms 0.81% 215.544ms 3.919ms 1.36 Gb 0 b 55
aten::view 0.01% 2.049ms 0.01% 2.049ms 3.986us 0 b 0 b 514
aten::_fft_r2c 0.02% 4.199ms 0.02% 4.403ms 489.253us 18.95 Mb 0 b 9
aten::permute 0.02% 6.368ms 0.03% 8.199ms 2.308us 0 b 0 b 3553
aten::reshape 0.00% 129.368us 0.01% 1.607ms 43.425us 11.56 Mb 0 b 37
aten::as_strided_ 0.01% 3.145ms 0.01% 3.145ms 9.678us 0 b 0 b 325
aten::transpose_ 0.00% 58.740us 0.00% 80.062us 7.278us 0 b 0 b 11
aten::real 0.00% 18.757us 0.00% 100.352us 50.176us 0 b 0 b 2
aten::view_as_real 0.00% 104.127us 0.00% 104.127us 7.438us 0 b 0 b 14
aten::select 0.00% 1.193ms 0.01% 1.355ms 9.279us 0 b 0 b 146
aten::imag 0.00% 8.806us 0.00% 30.240us 15.120us 0 b 0 b 2
aten::result_type 0.00% 70.758us 0.00% 70.758us 0.209us 0 b 0 b 339
aten::ones_like 0.00% 13.738us 0.00% 369.326us 184.663us 3.74 Mb 0 b 2
aten::empty_like 0.01% 2.921ms 0.03% 7.721ms 10.799us 2.57 Gb 0 b 715
aten::conv2d 0.01% 1.488ms 35.43% 9.408s 32.897ms 4.03 Gb 0 b 286
aten::convolution 0.02% 6.155ms 85.63% 22.741s 41.197ms 7.57 Gb 0 b 552
aten::contiguous 0.00% 802.573us 0.97% 258.745ms 976.396us 485.88 Mb 0 b 265
aten::clone 0.01% 1.868ms 1.10% 291.507ms 1.026ms 795.35 Mb 0 b 284
aten::mkldnn_convolution 52.78% 14.017s 52.82% 14.029s 26.370ms 7.21 Gb 0 b 532
aten::_batch_norm_impl_index 0.00% 237.198us 2.36% 627.862ms 36.933ms 1.76 Gb 0 b 17
aten::transpose 0.00% 268.824us 0.00% 346.326us 6.534us 0 b 0 b 53
aten::linear 0.00% 171.474us 0.01% 3.537ms 95.601us 1.15 Mb 0 b 37
aten::t 0.00% 136.998us 0.00% 258.435us 6.985us 0 b 0 b 37
aten::_reshape_alias 0.00% 11.899us 0.00% 11.899us 5.949us 0 b 0 b 2
aten::resolve_conj 0.00% 19.594us 0.00% 19.594us 0.236us 0 b 0 b 83
aten::_unsafe_view 0.00% 30.493us 0.00% 30.493us 6.099us 0 b 0 b 5
aten::repeat 0.00% 100.690us 0.06% 16.460ms 4.115ms 120.39 Mb 0 b 4
aten::expand 0.00% 133.376us 0.00% 170.158us 3.699us 0 b 0 b 46
aten::alias 0.00% 43.835us 0.00% 43.835us 5.479us 0 b 0 b 8
aten::unfold 0.00% 24.044us 0.00% 40.373us 3.106us 0 b 0 b 13
aten::expand_as 0.00% 5.205us 0.00% 13.783us 3.446us 0 b 0 b 4
aten::relu 0.00% 77.590us 0.02% 4.469ms 343.738us 42.88 Mb 0 b 13
aten::view_as_complex 0.00% 61.086us 0.00% 61.086us 12.217us 0 b 0 b 5
aten::_fft_c2r 0.01% 1.537ms 0.01% 1.611ms 536.940us 4.05 Mb 0 b 3
aten::min 0.00% 621.965us 0.00% 660.050us 110.008us 24 b 0 b 6
aten::new_ones 0.00% 16.831us 0.00% 29.109us 9.703us 3 b 0 b 3
aten::new_empty 0.00% 5.493us 0.00% 10.267us 3.422us 3 b 0 b 3
aten::equal 0.00% 34.230us 0.00% 36.643us 12.214us 0 b 0 b 3
aten::is_same_size 0.00% 2.413us 0.00% 2.413us 0.804us 0 b 0 b 3
aten::resolve_neg 0.00% 2.428us 0.00% 2.428us 0.347us 0 b 0 b 7
aten::dropout 0.00% 147.017us 0.00% 147.017us 2.535us 0 b 0 b 58
aten::norm_except_dim 0.00% 895.132us 0.05% 14.188ms 68.541us 209.63 Kb 0 b 207
aten::linalg_vector_norm 0.04% 9.544ms 0.04% 9.771ms 47.203us 0 b 0 b 207
aten::normal_ 0.27% 72.969ms 0.27% 72.969ms 715.384us 0 b 0 b 102
aten::numpy_T 0.00% 9.503us 0.00% 32.484us 8.121us 0 b 0 b 4
aten::clamp_ 0.00% 639.679us 0.00% 673.475us 2.072us 0 b 0 b 325
aten::stack 0.00% 205.585us 0.00% 913.244us 50.736us 690.10 Kb 0 b 18
aten::conv1d 0.01% 1.419ms 17.58% 4.670s 18.242ms 3.18 Gb 0 b 256
aten::item 0.00% 318.256us 0.00% 498.862us 5.422us 0 b 0 b 92
aten::_local_scalar_dense 0.00% 180.606us 0.00% 180.606us 1.963us 0 b 0 b 92
aten::_weight_norm_interface 0.10% 27.764ms 0.12% 32.844ms 82.109us 318.52 Mb 0 b 400
aten::lstm 0.00% 349.549us 3.19% 847.559ms 211.890ms 461.16 Mb 0 b 4
aten::cudnn_is_acceptable 0.00% 7.751us 0.00% 7.751us 1.292us 0 b 0 b 6
aten::randint 0.00% 39.300us 0.00% 64.351us 64.351us 128 b 0 b 1
aten::random_ 0.00% 21.591us 0.00% 21.591us 21.591us 0 b 0 b 1
aten::embedding 0.00% 32.862us 0.00% 84.085us 84.085us 8.00 Kb 0 b 1
aten::conv_transpose1d 0.00% 62.831us 32.63% 8.665s 866.535ms 363.38 Mb 0 b 10
aten::is_nonzero 0.00% 13.021us 0.00% 37.249us 6.208us 0 b 0 b 6
aten::max 0.00% 232.085us 0.00% 237.701us 79.234us 12 b 0 b 3
aten::unbind 0.00% 125.739us 0.00% 426.600us 71.100us 0 b 0 b 6
aten::unsafe_chunk 0.00% 84.807us 0.00% 1.136ms 19.590us 0 b 0 b 58
aten::unsafe_split 0.00% 282.191us 0.00% 1.051ms 18.128us 0 b 0 b 58
aten::sigmoid_ 0.00% 103.948us 0.00% 103.948us 1.792us 0 b 0 b 58
aten::tanh_ 0.00% 93.409us 0.00% 93.409us 3.221us 0 b 0 b 29
aten::zeros_like 0.00% 94.270us 0.00% 573.699us 30.195us 9.02 Mb 0 b 19
aten::_nnpack_available 0.00% 41.192us 0.00% 41.192us 4.119us 0 b 0 b 10
aten::thnn_conv2d 0.00% 33.124us 0.00% 989.102us 98.910us 74.00 Kb 0 b 10
aten::_slow_conv2d_forward 0.00% 703.227us 0.00% 955.978us 95.598us 74.00 Kb 0 b 10
defaults 0.00% 5.958us 0.00% 5.958us 0.993us 0 b 0 b 6
fused_add_tanh_sigmoid_multiply 0.01% 3.348ms 0.07% 19.409ms 404.357us 33.29 Mb 0 b 48
aten::split 0.00% 13.431us 0.00% 29.753us 29.753us 0 b 0 b 1
aten::randn_like 0.00% 5.509us 0.01% 1.375ms 1.375ms 710.25 Kb 0 b 1
aten::split_with_sizes 0.00% 78.100us 0.00% 88.051us 11.006us 0 b 0 b 8
aten::flip 0.00% 380.707us 0.00% 438.098us 54.762us 5.55 Mb 0 b 8
aten::leaky_relu_ 0.06% 14.666ms 0.06% 14.666ms 76.387us 0 b 0 b 192
aten::softmax 0.00% 43.512us 0.00% 1.116ms 557.934us 5.49 Mb 0 b 2
aten::count_nonzero 0.00% 157.384us 0.00% 162.405us 81.203us 16 b 0 b 2
aten::hann_window 0.00% 159.549us 0.01% 1.398ms 174.810us 43.47 Kb -8 b 8
aten::add_ 0.04% 11.679ms 0.04% 11.724ms 96.096us 8 b -24 b 122
aten::div_ 0.00% 755.837us 0.00% 900.473us 2.712us 0 b -28 b 332
aten::mul_ 0.00% 1.314ms 0.01% 3.284ms 7.763us 0 b -1.36 Kb 423
aten::batch_norm 0.00% 143.066us 2.36% 628.005ms 36.941ms 1.76 Gb -12.05 Kb 17
aten::layer_norm 0.00% 12.842us 0.00% 1.184ms 591.972us 3.57 Mb -14.26 Kb 2
aten::native_batch_norm 2.36% 626.984ms 2.36% 627.560ms 36.915ms 1.76 Gb -23.97 Kb 17
aten::mkldnn_rnn_layer 3.18% 845.059ms 3.18% 845.513ms 105.689ms 450.09 Mb -64.00 Kb 8
aten::gru 0.00% 640.703us 0.02% 4.656ms 2.328ms 15.50 Kb -116.00 Kb 2
aten::_weight_norm 0.01% 1.580ms 0.13% 34.424ms 86.060us 318.28 Mb -250.25 Kb 400
aten::native_layer_norm 0.00% 504.323us 0.00% 1.171ms 585.552us 3.59 Mb -3.57 Mb 2
aten::matmul 0.00% 38.210us 0.00% 1.098ms 548.934us 960.67 Kb -4.67 Mb 2
aten::unfold_backward 0.01% 3.266ms 0.02% 4.928ms 821.302us 4.01 Mb -8.02 Mb 6
aten::istft 0.00% 238.548us 0.03% 8.405ms 2.802ms 1.98 Mb -14.11 Mb 3
aten::stft 0.00% 225.875us 0.03% 6.657ms 739.694us 18.95 Mb -18.92 Mb 9
aten::_convolution 31.79% 8.444s 85.60% 22.735s 41.186ms 7.57 Gb -471.15 Mb 552
[memory] 0.00% 0.000us 0.00% 0.000us 0.000us -21.93 Gb -21.93 Gb 17675
Self CPU time total: 26.558s
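To rank or sum rows of the table programmatically, a small sketch follows. It assumes the profiler prints memory in binary units (1 Kb = 1024 bytes), which the output itself does not state. `parse_mem` and `self_mem_by_op` are hypothetical helper names, and the three sample rows are copied from the table above.

```python
import re

# Binary multipliers; assumed to match the units printed by the profiler.
_UNITS = {"b": 1, "Kb": 1024, "Mb": 1024**2, "Gb": 1024**3}

# Matches a memory cell such as "11.58 Gb", "-471.15 Mb" or "0 b".
# Time cells ("23.884ms") have no space before the unit, so they are skipped.
_MEM = re.compile(r"(-?\d+(?:\.\d+)?) (b|Kb|Mb|Gb)\b")


def parse_mem(text: str) -> float:
    """Convert one memory cell, e.g. '1.82 Gb', to bytes."""
    m = _MEM.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a memory value: {text!r}")
    return float(m.group(1)) * _UNITS[m.group(2)]


def self_mem_by_op(rows: list[str]) -> dict[str, float]:
    """Map operator name to its 'Self CPU Mem' in bytes."""
    result = {}
    for row in rows:
        name = row.split()[0]
        # Each row holds two memory cells: 'CPU Mem', then 'Self CPU Mem'.
        value, unit = _MEM.findall(row)[-1]
        result[name] = float(value) * _UNITS[unit]
    return result


rows = [
    "aten::empty 0.09% 23.884ms 0.09% 23.884ms 2.990us 11.58 Gb 11.58 Gb 7987",
    "aten::mul 0.90% 239.991ms 0.91% 241.627ms 278.051us 1.82 Gb 1.82 Gb 869",
    "aten::_convolution 31.79% 8.444s 85.60% 22.735s 41.186ms 7.57 Gb -471.15 Mb 552",
]

# Print operators from largest to smallest self-allocated memory.
for name, nbytes in sorted(self_mem_by_op(rows).items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {nbytes / 1024**3:8.2f} GiB")
```

Negative values, such as the one for `aten::_convolution`, are kept as-is; in this table they appear on rows where more memory was released than allocated by the operator itself.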
To reproduce the table yourself, add the following to the main function:
import torch
import torch.profiler as profiler

# Profile CPU time and memory for the whole pipeline.
with profiler.profile(
    activities=[profiler.ProfilerActivity.CPU],
    with_stack=True,
    profile_memory=True,
) as prof:
    test_silentcipher()
    audio_watermarking()
    voice_cloning()
    detecting_from_clone()

# Sort by memory allocated by each operator itself; row_limit=-1 prints all rows.
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=-1))
To measure on GPU instead, replace the profiler activity with `torch.profiler.ProfilerActivity.CUDA`.
Result: Killed, no file created.

Logs:

cd /project && python simple_example.py
ckpt path or config path does not exist! Downloading the model from the Hugging Face Hub...
enc_c.ckpt: 100%|██████████| 170k/170k [00:00<00:00, 2.69MB/s]
16_khz/97561_iteration/hparams.yaml: 100%|██████████| 1.56k/1.56k [00:00<00:00, 24.6MB/s]
.gitattributes: 100%|██████████| 1.52k/1.52k [00:00<00:00, 22.4MB/s]
enc_c.ckpt: 100%|██████████| 185k/185k [00:00<00:00, 1.35MB/s]
44_1_khz/73999_iteration/hparams.yaml: 100%|██████████| 1.47k/1.47k [00:00<00:00, 23.3MB/s]
README.md: 100%|██████████| 7.79k/7.79k [00:00<00:00, 66.9MB/s]
dec_c.ckpt: 100%|██████████| 2.01M/2.01M [00:01<00:00, 1.32MB/s]
config.json: 100%|██████████| 51.0/51.0 [00:00<00:00, 715kB/s]
opt.ckpt: 100%|██████████| 23.4M/23.4M [00:03<00:00, 6.73MB/s]
dec_m_0.ckpt: 100%|██████████| 9.55M/9.55M [00:04<00:00, 2.02MB/s]
dec_c.ckpt: 100%|██████████| 2.01M/2.01M [00:00<00:00, 2.74MB/s]
dec_m_0.ckpt: 100%|██████████| 9.54M/9.54M [00:05<00:00, 1.67MB/s]
opt.ckpt: 100%|██████████| 23.4M/23.4M [00:06<00:00, 3.63MB/s]
Fetching 13 files: 100%|██████████| 13/13 [00:08<00:00, 1.60it/s]
/usr/local/lib/python3.12/site-packages/silentcipher/server.py:444: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  self.enc_c.load_state_dict(self.convert_dataparallel_to_normal(torch.load(os.path.join(ckpt_dir, "enc_c.ckpt"), map_location=self.device)))
/usr/local/lib/python3.12/site-packages/silentcipher/server.py:445: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  self.dec_c.load_state_dict(self.convert_dataparallel_to_normal(torch.load(os.path.join(ckpt_dir, "dec_c.ckpt"), map_location=self.device)))
/usr/local/lib/python3.12/site-packages/silentcipher/server.py:447: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  m.load_state_dict(self.convert_dataparallel_to_normal(torch.load(os.path.join(ckpt_dir, f"dec_m_{i}.ckpt"), map_location=self.device)))
Using the default SDR of 47 dB
Killed
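The run ends with `Killed` and no profiler table. On Linux that often means the kernel's out-of-memory killer stopped the process, though this is an assumption and the log does not confirm it. One way to narrow down which stage exhausts memory is to print a marker before each stage and the peak resident set size after it, so the last line printed before the kill points at the culprit. The sketch below uses only the standard-library `resource` module (Unix only). `peak_rss_mib`, `run_stages`, and the `fake_stage` placeholder are hypothetical names; the real stage functions from the profiling snippet would be passed in instead.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_stages(stages):
    """Run each callable in order and report peak RSS after it returns."""
    peaks = []
    for stage in stages:
        # Flush immediately so the line survives if the stage gets killed.
        print(f"starting {stage.__name__}", flush=True)
        stage()
        peak = peak_rss_mib()
        peaks.append((stage.__name__, peak))
        print(f"finished {stage.__name__}: peak RSS {peak:.1f} MiB", flush=True)
    return peaks


def fake_stage():
    """Hypothetical stand-in for a real stage; builds a ~50 MiB bytes object."""
    buf = b"x" * (50 * 1024 * 1024)
    return len(buf)


peaks = run_stages([fake_stage])
```

Because peak RSS only ever grows, a jump between two consecutive lines shows how much a stage added on top of everything before it.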