dhui / passhash

Go library providing simple and secure password management
Apache License 2.0
8 stars 4 forks source link

Improve performance #10

Closed dhui closed 6 years ago

dhui commented 6 years ago

It appears that reflected function calls are slow. See: https://github.com/golang/go/issues/7818

Need to investigate alternatives to reflected function calls.

Supporting data

Collected profile from benchmarks

$ go test -cpuprofile cpu.prof -memprofile mem.prof -bench . -benchmem

CPU Profile

$ go tool pprof -top -cum cpu.prof
Type: cpu
Time: Sep 11, 2018 at 4:18pm (PDT)
Duration: 1.32mins, Total samples = 285.11s (361.04%)
Showing nodes accounting for 253.95s, 89.07% of 285.11s total
Dropped 410 nodes (cum <= 1.43s)
      flat  flat%   sum%        cum   cum%
   164.30s 57.63% 57.63%    164.31s 57.63%  runtime.usleep
     0.51s  0.18% 57.81%    162.27s 56.91%  runtime.lock
     0.02s 0.007% 57.81%    160.68s 56.36%  runtime.osyield
     0.10s 0.035% 57.85%     94.14s 33.02%  runtime.morestack
     1.26s  0.44% 58.29%     94.04s 32.98%  runtime.newstack
     0.23s 0.081% 58.37%     88.93s 31.19%  runtime.copystack
     0.10s 0.035% 58.41%     88.50s 31.04%  runtime.mcall
     0.14s 0.049% 58.45%     77.98s 27.35%  runtime.goexit0
     0.31s  0.11% 58.56%     55.08s 19.32%  runtime.stackfree
     0.19s 0.067% 58.63%     54.77s 19.21%  runtime.stackcacherelease
     0.32s  0.11% 58.74%     47.48s 16.65%  reflect.Value.Call
     3.27s  1.15% 59.89%     47.02s 16.49%  reflect.Value.call
     0.63s  0.22% 60.11%     46.06s 16.16%  runtime.schedule
     0.73s  0.26% 60.37%     42.99s 15.08%  github.com/dhui/thevent.(*Event).dispatch.func2
     0.59s  0.21% 60.57%     42.64s 14.96%  runtime.gfput
         0     0% 60.57%     34.59s 12.13%  testing.(*B).runN
         0     0% 60.57%     34.55s 12.12%  testing.(*B).launch
     0.28s 0.098% 60.67%     34.35s 12.05%  github.com/dhui/thevent.(*Event).dispatch
     0.17s  0.06% 60.73%     32.13s 11.27%  runtime.stackalloc
     0.07s 0.025% 60.76%     31.94s 11.20%  runtime.stackcacherefill
     4.28s  1.50% 62.26%     29.30s 10.28%  runtime.gentraceback
     0.13s 0.046% 62.30%     27.47s  9.63%  runtime.systemstack
     5.71s  2.00% 64.31%     22.91s  8.04%  reflect.implements
     0.08s 0.028% 64.33%     21.90s  7.68%  runtime.chansend1
     0.09s 0.032% 64.36%     21.82s  7.65%  runtime.chansend
         0     0% 64.36%     20.17s  7.07%  runtime.mstart
     0.01s 0.0035% 64.37%     19.07s  6.69%  runtime.newproc.func1
     0.29s   0.1% 64.47%     19.06s  6.69%  runtime.newproc1
     0.04s 0.014% 64.48%     17.94s  6.29%  runtime.findrunnable
     0.03s 0.011% 64.49%     15.51s  5.44%  github.com/dhui/thevent_test.BenchmarkEvents.func1.1
     0.01s 0.0035% 64.50%     15.47s  5.43%  github.com/dhui/thevent.(*Event).Dispatch
     1.59s  0.56% 65.06%     15.47s  5.43%  runtime.adjustframe
     0.02s 0.007% 65.06%     15.05s  5.28%  github.com/dhui/thevent_test.BenchmarkEvents.func1.2
     0.02s 0.007% 65.07%     15.02s  5.27%  github.com/dhui/thevent.(*Event).DispatchWithResults
     7.74s  2.71% 67.78%     14.87s  5.22%  runtime.pcvalue
     0.38s  0.13% 67.92%     14.77s  5.18%  reflect.(*rtype).AssignableTo
     0.52s  0.18% 68.10%     14.50s  5.09%  reflect.Value.assignTo
         0     0% 68.10%     14.24s  4.99%  runtime.semawakeup
     0.38s  0.13% 68.23%     13.82s  4.85%  runtime.unlock
     0.63s  0.22% 68.45%     12.50s  4.38%  runtime.gfget
     2.18s  0.76% 69.22%     11.76s  4.12%  runtime.getStackMap
     0.01s 0.0035% 69.22%     10.18s  3.57%  runtime.gfget.func1
     0.05s 0.018% 69.24%     10.05s  3.52%  runtime.park_m
     9.51s  3.34% 72.58%      9.51s  3.34%  runtime.pthread_cond_signal
     1.61s  0.56% 73.14%      8.74s  3.07%  reflect.(*rtype).typeOff
     0.43s  0.15% 73.29%      8.50s  2.98%  runtime.pcdatavalue
     0.62s  0.22% 73.51%      7.44s  2.61%  runtime.funcspdelta
     1.48s  0.52% 74.03%      7.13s  2.50%  reflect.resolveTypeOff
         0     0% 74.03%      7.04s  2.47%  runtime.semasleep
     1.46s  0.51% 74.54%      6.83s  2.40%  runtime.mallocgc
     2.53s  0.89% 75.43%      6.58s  2.31%  reflect.(*rtype).nameOff
     5.44s  1.91% 77.34%      6.11s  2.14%  runtime.step
     5.78s  2.03% 79.36%      5.78s  2.03%  runtime.pthread_cond_wait
     0.30s  0.11% 79.47%      5.74s  2.01%  runtime.runqput
     1.34s  0.47% 79.94%      5.65s  1.98%  runtime.(*_type).typeOff
     0.10s 0.035% 79.97%      5.32s  1.87%  runtime.runqputslow
         0     0% 79.97%      5.17s  1.81%  runtime.stopm
         0     0% 79.97%      4.91s  1.72%  runtime.notesleep
     4.54s  1.59% 81.57%      4.54s  1.59%  runtime.pthread_mutex_lock
     3.45s  1.21% 82.78%      4.31s  1.51%  runtime.resolveTypeOff
     0.12s 0.042% 82.82%      4.08s  1.43%  reflect.unsafe_New
     1.90s  0.67% 83.48%      4.05s  1.42%  reflect.resolveNameOff
     0.43s  0.15% 83.63%      3.99s  1.40%  reflect.directlyAssignable
         0     0% 83.63%      3.94s  1.38%  runtime.goschedImpl
     0.01s 0.0035% 83.64%      3.58s  1.26%  runtime.runqsteal
         0     0% 83.64%      3.57s  1.25%  runtime.gopreempt_m
     0.02s 0.007% 83.64%      3.57s  1.25%  runtime.runqgrab
     1.53s  0.54% 84.18%      3.53s  1.24%  reflect.(*rtype).Name
     0.09s 0.032% 84.21%      3.40s  1.19%  runtime.newproc
         0     0% 84.21%      3.30s  1.16%  runtime.gcstopm
     0.42s  0.15% 84.36%      3.02s  1.06%  reflect.funcLayout
     0.28s 0.098% 84.46%      2.75s  0.96%  runtime.makeslice
     0.47s  0.16% 84.62%      2.58s   0.9%  sync.(*Map).Load
     0.02s 0.007% 84.63%      2.48s  0.87%  github.com/dhui/thevent_test.BenchmarkEvents.func1.3
         0     0% 84.63%      2.45s  0.86%  github.com/dhui/thevent.(*Event).DispatchAsync
     1.89s  0.66% 85.29%      2.22s  0.78%  runtime.findfunc
     2.15s  0.75% 86.05%      2.15s  0.75%  runtime.resolveNameOff
     0.08s 0.028% 86.08%      2.12s  0.74%  github.com/dhui/thevent_test.BenchmarkEvents.func1.4.1
     2.01s   0.7% 86.78%      2.11s  0.74%  runtime.adjustpointers
     0.06s 0.021% 86.80%      2.04s  0.72%  runtime.chanrecv
         0     0% 86.80%      2.04s  0.72%  runtime.chanrecv2
         0     0% 86.80%      2.04s  0.72%  runtime.gcBgMarkWorker.func2
     0.21s 0.074% 86.88%      2.04s  0.72%  runtime.gcDrain
     0.94s  0.33% 87.20%         2s   0.7%  reflect.(*rtype).String
     0.72s  0.25% 87.46%         2s   0.7%  runtime.mapaccess2
         0     0% 87.46%      1.95s  0.68%  runtime.gcBgMarkWorker
     1.10s  0.39% 87.84%      1.62s  0.57%  runtime.heapBitsSetType
     0.38s  0.13% 87.98%      1.57s  0.55%  runtime.call64
     1.56s  0.55% 88.52%      1.56s  0.55%  reflect.name.name (inline)
     1.54s  0.54% 89.06%      1.54s  0.54%  runtime.memmove
     0.01s 0.0035% 89.07%      1.52s  0.53%  github.com/dhui/thevent_test.BenchmarkEvents.func1.4
         0     0% 89.07%      1.44s  0.51%  github.com/dhui/thevent.(*Event).DispatchAsyncWithResults
     0.01s 0.0035% 89.07%      1.44s  0.51%  runtime.gcAssistAlloc

Memory profile

$ go tool pprof -top -cum mem.prof
Type: alloc_space
Time: Sep 11, 2018 at 4:19pm (PDT)
Showing nodes accounting for 7010.28MB, 99.57% of 7040.50MB total
Dropped 29 nodes (cum <= 35.20MB)
      flat  flat%   sum%        cum   cum%
         0     0%     0%  6422.26MB 91.22%  reflect.Value.Call
 6421.25MB 91.20% 91.20%  6422.26MB 91.22%  reflect.Value.call
         0     0% 91.20%  3901.35MB 55.41%  testing.(*B).runN
         0     0% 91.20%  3894.65MB 55.32%  testing.(*B).launch
  392.02MB  5.57% 96.77%  3856.15MB 54.77%  github.com/dhui/thevent.(*Event).dispatch
   30.50MB  0.43% 97.21%  3096.12MB 43.98%  github.com/dhui/thevent.(*Event).dispatch.func2
    7.50MB  0.11% 97.31%  1841.07MB 26.15%  github.com/dhui/thevent_test.BenchmarkEvents.func1.1
         0     0% 97.31%  1833.57MB 26.04%  github.com/dhui/thevent.(*Event).Dispatch
       8MB  0.11% 97.43%  1738.57MB 24.69%  github.com/dhui/thevent_test.BenchmarkEvents.func1.2
         0     0% 97.43%  1730.57MB 24.58%  github.com/dhui/thevent.(*Event).DispatchWithResults
   12.50MB  0.18% 97.60%   198.01MB  2.81%  github.com/dhui/thevent_test.BenchmarkEvents.func1.4
         0     0% 97.60%   185.51MB  2.63%  github.com/dhui/thevent.(*Event).DispatchAsyncWithResults
   10.50MB  0.15% 97.75%      117MB  1.66%  github.com/dhui/thevent_test.BenchmarkEvents.func1.3
  107.50MB  1.53% 99.28%   107.50MB  1.53%  reflect.ValueOf (inline)
         0     0% 99.28%   106.50MB  1.51%  github.com/dhui/thevent.(*Event).DispatchAsync
         0     0% 99.28%    36.24MB  0.51%  runtime.mstart
   20.50MB  0.29% 99.57%    36.24MB  0.51%  runtime.systemstack
dhui commented 6 years ago

Whoops, opened issue in wrong repo... The issue actually belongs to https://github.com/dhui/thevent