mandiant / GoReSym

Go symbol recovery tool
MIT License

objfile: optimize findAllOccurrences #26

Closed williballenthin closed 11 months ago

williballenthin commented 11 months ago

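in this hot routine, rely on the stdlib bytes.Index rather than hand-rolling a memfind routine. it's much faster: on the sample below, wall time drops from 15.20s to 3.02s, and cumulative time in pcln_scan drops from 12.65s to 0.64s.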
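As a rough illustration of the idea (not the actual GoReSym code), here is a minimal sketch of the two approaches. The function names, the single-needle signature, and the sample input are assumptions made for this example; the real findAllOccurrences may take different arguments. The naive version compares the needle at every offset with bytes.Equal, which matches the bytes.Equal / memeqbody hot spots in the "before" profile; the second version lets bytes.Index skip ahead to each match.

```go
package main

import (
	"bytes"
	"fmt"
)

// findAllOccurrencesNaive is a hypothetical stand-in for a hand-rolled scan:
// it tests the needle at every offset of data with bytes.Equal.
func findAllOccurrencesNaive(data, needle []byte) []int {
	var results []int
	if len(needle) == 0 {
		return results
	}
	for i := 0; i+len(needle) <= len(data); i++ {
		if bytes.Equal(data[i:i+len(needle)], needle) {
			results = append(results, i)
		}
	}
	return results
}

// findAllOccurrencesIndex is a hypothetical rewrite that delegates the search
// to bytes.Index and only loops once per match.
func findAllOccurrencesIndex(data, needle []byte) []int {
	var results []int
	if len(needle) == 0 {
		return results
	}
	offset := 0
	for {
		i := bytes.Index(data[offset:], needle)
		if i < 0 {
			return results
		}
		results = append(results, offset+i)
		// Advance by one byte past the match start so that overlapping
		// matches are still reported, like the naive version does.
		offset += i + 1
	}
}

func main() {
	data := []byte("abcabcab")
	needle := []byte("abc")
	fmt.Println(findAllOccurrencesNaive(data, needle))
	fmt.Println(findAllOccurrencesIndex(data, needle))
}
```

Both functions should return the same offsets for the same input; the difference is only in how much work is done between matches.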

before:

❯ time ./GoReSym 11d7cb5750c44c40e767c7c4fa0f388ed64f636cf6b97ceee7bf9ae77683a3c0 > /dev/null
2023/07/21 13:18:58 profile: cpu profiling enabled, cpu.pprof
GoReSym: profile: cpu profiling disabled, cpu.pprof

________________________________________________________
Executed in   15.20 secs    fish           external
   usr time   15.01 secs  151.00 micros   15.01 secs
   sys time    0.87 secs  136.00 micros    0.87 secs

❯ pprof -cum -top cpu.pprof
File: GoReSym
Type: cpu
Time: Jul 21, 2023 at 1:18pm (CEST)
Duration: 15.19s, Total samples = 15.42s (101.50%)
Showing nodes accounting for 14.99s, 97.21% of 15.42s total
Dropped 88 nodes (cum <= 0.08s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%     14.67s 95.14%  main.main
         0     0%     0%     14.67s 95.14%  runtime.main
     0.01s 0.065% 0.065%     14.40s 93.39%  main.main_impl
         0     0% 0.065%     12.78s 82.88%  github.com/mandiant/GoReSym/objfile.(*Entry).PCLineTable
         0     0% 0.065%     12.78s 82.88%  github.com/mandiant/GoReSym/objfile.(*File).PCLineTable (inline)
         0     0% 0.065%     12.65s 82.04%  github.com/mandiant/GoReSym/objfile.(*peFile).pcln
     0.29s  1.88%  1.95%     12.65s 82.04%  github.com/mandiant/GoReSym/objfile.(*peFile).pcln_scan
     4.57s 29.64% 31.58%     12.14s 78.73%  github.com/mandiant/GoReSym/objfile.findAllOccurrences (inline)
     1.03s  6.68% 38.26%      7.57s 49.09%  bytes.Equal (inline)
     5.48s 35.54% 73.80%      5.48s 35.54%  memeqbody
         0     0% 73.80%      1.47s  9.53%  github.com/mandiant/GoReSym/objfile.(*Entry).ModuleDataTable
         0     0% 73.80%      1.47s  9.53%  github.com/mandiant/GoReSym/objfile.(*File).ModuleDataTable
         0     0% 73.80%      1.47s  9.53%  github.com/mandiant/GoReSym/objfile.(*peFile).moduledata_scan
         0     0% 73.80%      1.39s  9.01%  github.com/mandiant/GoReSym/debug/pe.(*File).DataAfterSection
     1.06s  6.87% 80.67%      1.06s  6.87%  runtime.memequal
     0.79s  5.12% 85.80%      0.79s  5.12%  runtime.memmove
         0     0% 85.80%      0.79s  5.12%  runtime.systemstack
         0     0% 85.80%      0.72s  4.67%  runtime.gcBgMarkWorker
         0     0% 85.80%      0.72s  4.67%  runtime.gcBgMarkWorker.func2
         0     0% 85.80%      0.72s  4.67%  runtime.gcDrain
         0     0% 85.80%      0.56s  3.63%  github.com/mandiant/GoReSym/debug/pe.(*Section).Data
         0     0% 85.80%      0.53s  3.44%  runtime.growslice   

after:

❯ time ./GoReSym 11d7cb5750c44c40e767c7c4fa0f388ed64f636cf6b97ceee7bf9ae77683a3c0 > /dev/null
2023/07/21 13:20:03 profile: cpu profiling enabled, cpu.pprof
GoReSym: profile: cpu profiling disabled, cpu.pprof

________________________________________________________
Executed in    3.02 secs    fish           external
   usr time    2.87 secs  204.00 micros    2.87 secs
   sys time    0.82 secs  176.00 micros    0.82 secs

❯ pprof -cum -top cpu.pprof
File: GoReSym
Type: cpu
Time: Jul 21, 2023 at 1:20pm (CEST)
Duration: 3.02s, Total samples = 3.30s (109.33%)
Showing nodes accounting for 3.10s, 93.94% of 3.30s total
Dropped 68 nodes (cum <= 0.02s)
      flat  flat%   sum%        cum   cum%
         0     0%     0%      2.51s 76.06%  main.main
         0     0%     0%      2.51s 76.06%  runtime.main
         0     0%     0%      2.24s 67.88%  main.main_impl
         0     0%     0%      1.39s 42.12%  github.com/mandiant/GoReSym/debug/pe.(*File).DataAfterSection
         0     0%     0%      1.34s 40.61%  github.com/mandiant/GoReSym/objfile.(*Entry).ModuleDataTable
         0     0%     0%      1.34s 40.61%  github.com/mandiant/GoReSym/objfile.(*File).ModuleDataTable
         0     0%     0%      1.34s 40.61%  github.com/mandiant/GoReSym/objfile.(*peFile).moduledata_scan
     0.97s 29.39% 29.39%      0.97s 29.39%  runtime.memmove
         0     0% 29.39%      0.78s 23.64%  github.com/mandiant/GoReSym/objfile.(*Entry).PCLineTable
         0     0% 29.39%      0.78s 23.64%  github.com/mandiant/GoReSym/objfile.(*File).PCLineTable (inline)
         0     0% 29.39%      0.75s 22.73%  runtime.systemstack
         0     0% 29.39%      0.73s 22.12%  runtime.gcBgMarkWorker
         0     0% 29.39%      0.73s 22.12%  runtime.gcBgMarkWorker.func2
         0     0% 29.39%      0.73s 22.12%  runtime.gcDrain
         0     0% 29.39%      0.64s 19.39%  github.com/mandiant/GoReSym/objfile.(*peFile).pcln
         0     0% 29.39%      0.64s 19.39%  github.com/mandiant/GoReSym/objfile.(*peFile).pcln_scan
     0.10s  3.03% 32.42%      0.60s 18.18%  bytes.Index
         0     0% 32.42%      0.57s 17.27%  runtime.growslice