sourcegraph / appdash

Application tracing system for Go, based on Google's Dapper.
https://sourcegraph.com
Other
1.72k stars 137 forks source link

Improve ChunkedCollector performance slightly #80

Closed slimsag closed 9 years ago

slimsag commented 9 years ago

Summary

This change improves performance of ChunkedCollector slightly, but also highlights an important issue (which I strongly believe will improve in Go 1.5, hopefully I will have time to test with 1.5 soon to verify). For now see the workaround mentioned in Solution section 2.

(also see relevant tests and data at Appdash PR #80)

Presumptions

I believe that ChunkedCollector when placing items into it's underlying map (which holds onto them for < 500ms in our prod. apps, and just 10ms in our tests) is indirectly causing 20-41k tiny heap objects to be allocated (but the Go code is NOT leaking them) and that the Go runtime is poorly equipped in Go 1.4 to either:

  1. Reuse these small objects behind the map so that many allocations do not need to occur.
  2. Release the memory allocated by these small objects back to the OS (seen as top RES above).

    History

I have also observed results similiar to this before in another project of mine (which unlike the case here, actually allocated several hundred thousand small objects per second), leaving me with ~12ms GC times on average. I believe these issues to be one in the same and likely fixed or improved significantly with the Go 1.5 GC improvements.

Analysis:

In order to fix this issue i have looked at two things as a solution:

  1. Attempt to lower the amount of heap objects that Appdash ChunkedCollector must allocate.
    • This is not possible, because (after PR #80 is merged) it is literally just a very simple map and is being cleared to nil upon each ChunkedCollector.Flush.
    • Using anything other than a map (e.g. a slice) is probably not viable because of the number of spans that will end up in the map, we would effectively be trading memory usage for CPU usage. Not good.
    • However, I was able to identify and remove (in PR #80) a single slice that was uneeded, this cuts the total allocations nearly in half.
  2. Setup a background goroutine that occasionally invokes runtime.GC twice (needed to trigger a full GC) followed by one FreeOSMemory call.
    • Go by default triggers a GC only after the heap has grown 100%, which is normally fine, but in this case it appears to be unable to recollect these small objects and reuse them, so the heap grows nearly unbounded in size.
    • It is likely this issue would appear in apps today as a unbounded heap growth up to entire memory consumption, followed by either a (very) long GC or an OOM runtime panic.
    • A good value for which people may call these three functions in sequence might be every 5-30 minutes, just enough so that these small allocations are returned to the host OS (and can be reqaquired by Go subsequently) before the heap grows too large, but not such a quick value that it would hinder performance.

Example:

func init() {
    // Workaround until Go 1.5 is released, as described at:
    //
    //  https://github.com/sourcegraph/appdash/pull/80
    //
    go func() {
        for{
            time.Sleep(5 * time.Minute)
            runtime.GC()
            runtime.GC()
            debug.FreeOSMemory()
        }
    }()
}
slimsag commented 9 years ago

Index / Matrix (of test data below)

Link To Test call GC call FreeOSMemory top VIRT top RES HeapObjects (pre, post, post-GC-Free) TotalAlloc
Test 1 Yes Yes 270.6m 19.2m (530, 29733, 548) 314MB
Test 2 No No 245.7m 158.9m (530, 22295, No GC) 351MB
Test 3 Yes No 473.3m 314.9m (527, 41744, 4832) 356MB
Test 4 Yes No 241.0m 210.8m (528, 20928, 9676) 345MB
Test 5 Yes Yes 215.4m 19.7m (524, 38667, 545) 224MB

Note: CI failure is unrelated to this change -- I've also tested locally and it passes all tests fine.

slimsag commented 9 years ago

Test 1

Test Feature Enabled
10 explicit runtime.GC calls? Yes
10 explicit runtime/debug.FreeOSMemory calls? Yes
$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       82KB   (allocated and still in use)                  
    - TotalAlloc  863KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15976  (number of mallocs)                           
    - Frees       15446  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     82KB   (allocated and still in use)         
    - HeapSys       848KB  (obtained from system)               
    - HeapIdle      464KB  (in idle spans)                      
    - HeapInuse     384KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   530    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   176KB  (used by stack allocator right now)                   
    - StackSys     176KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     269KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       163MB  (allocated and still in use)                  
      - TotalAlloc  314MB  (allocated (even if freed))                   
      - Sys         178MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     75075  (number of mallocs)                           
      - Frees       45342  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     163MB  (allocated and still in use)         
      - HeapSys       165MB  (obtained from system)               
      - HeapIdle      1MB    (in idle spans)                      
      - HeapInuse     164MB  (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   29733  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   176KB  (used by stack allocator right now)                   
      - StackSys     176KB  (obtained from system)                                
      - MSpanInuse   165KB  (mspan structures / in use now)                       
      - MSpanSys     176KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        10MB   (GC metadata / obtained form system)                  
      - OtherSys     1MB    (other system allocations)                            

[garbage collections]
  0. GC - 12.675495ms
  1. GC - 9.219848ms
  2. GC - 8.557734ms
  3. GC - 8.729595ms
  4. GC - 8.574088ms
  5. GC - 8.921975ms
  6. GC - 8.551715ms
  7. GC - 8.605256ms
  8. GC - 8.704241ms
  9. GC - 990.798µs

[free os memory]
  0. FreeOSMemory - 2.886292ms
  1. FreeOSMemory - 271.009µs
  2. FreeOSMemory - 228.818µs
  3. FreeOSMemory - 218.233µs
  4. FreeOSMemory - 214.515µs
  5. FreeOSMemory - 213.408µs
  6. FreeOSMemory - 213.43µs
  7. FreeOSMemory - 222.377µs
  8. FreeOSMemory - 222.013µs
  9. FreeOSMemory - 215.682µs

[after GC & FreeOSMemory] 1000000-collections:
      General statistics
        - Alloc       81KB   (allocated and still in use)                  
        - TotalAlloc  314MB  (allocated (even if freed))                   
        - Sys         178MB  (obtained from system (sum of XxxSys below))  
        - Lookups     5      (number of pointer lookups)                   
        - Mallocs     76490  (number of mallocs)                           
        - Frees       75942  (number of frees)                             

      Main allocation heap statistics
        - HeapAlloc     81KB   (allocated and still in use)         
        - HeapSys       165MB  (obtained from system)               
        - HeapIdle      165MB  (in idle spans)                      
        - HeapInuse     376KB  (in non-idle span)                   
        - HeapReleased  165MB  (released to the OS)                 
        - HeapObjects   548    (total number of allocated objects)  

      Low-level fixed-size structure allocator statistics
        - StackInuse   176KB  (used by stack allocator right now)                   
        - StackSys     176KB  (obtained from system)                                
        - MSpanInuse   4KB    (mspan structures / in use now)                       
        - MSpanSys     176KB  (obtained from system)                                
        - MCacheInuse  1KB    (in use now))                                         
        - MCacheSys    16KB   (mcache structures / obtained from system)            
        - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
        - GCSys        10MB   (GC metadata / obtained form system)                  
        - OtherSys     1MB    (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 4129' now       1 30934787484 ns/op
ok      sourcegraph.com/sourcegraph/appdash 30.982s

top

$ top -p 4129

top - 13:45:03 up 2 days, 10:52,  9 users,  load average: 0.23, 0.37, 0.53
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  1.5 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7269104 used,   818940 free,   194556 buffers
KiB Swap: 15625212 total,   975200 used, 14650012 free.  1373772 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 4129 stephen   20   0  270.6m  19.2m   5.7m S   2.6  0.2   0:01.18 appdash.test                     
slimsag commented 9 years ago

Test 2

Test Feature Enabled
10 explicit runtime.GC calls? No
10 explicit runtime/debug.FreeOSMemory calls? No
$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       81KB   (allocated and still in use)                  
    - TotalAlloc  864KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15978  (number of mallocs)                           
    - Frees       15448  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     81KB   (allocated and still in use)         
    - HeapSys       832KB  (obtained from system)               
    - HeapIdle      448KB  (in idle spans)                      
    - HeapInuse     384KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   530    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   192KB  (used by stack allocator right now)                   
    - StackSys     192KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     269KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       137MB  (allocated and still in use)                  
      - TotalAlloc  351MB  (allocated (even if freed))                   
      - Sys         154MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     79463  (number of mallocs)                           
      - Frees       57168  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     137MB  (allocated and still in use)         
      - HeapSys       142MB  (obtained from system)               
      - HeapIdle      4MB    (in idle spans)                      
      - HeapInuse     137MB  (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   22295  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   192KB  (used by stack allocator right now)                   
      - StackSys     192KB  (obtained from system)                                
      - MSpanInuse   125KB  (mspan structures / in use now)                       
      - MSpanSys     128KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        9MB    (GC metadata / obtained form system)                  
      - OtherSys     910KB  (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 4279' now       1 30881045987 ns/op
ok      sourcegraph.com/sourcegraph/appdash 30.919s

top

$ top -p 4279

top - 13:50:39 up 2 days, 10:57,  9 users,  load average: 0.32, 0.38, 0.49
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  1.0 sy,  0.0 ni, 96.4 id,  0.9 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7513144 used,   574900 free,   195200 buffers
KiB Swap: 15625212 total,   975132 used, 14650080 free.  1394584 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 4279 stephen   20   0  245.7m 158.9m   5.6m S   2.0  2.0   0:01.09 appdash.test                     
slimsag commented 9 years ago

Test 3

Test Feature Enabled
10 explicit runtime.GC calls? Yes
10 explicit runtime/debug.FreeOSMemory calls? No
$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       80KB   (allocated and still in use)                  
    - TotalAlloc  863KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15973  (number of mallocs)                           
    - Frees       15446  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     80KB   (allocated and still in use)         
    - HeapSys       832KB  (obtained from system)               
    - HeapIdle      456KB  (in idle spans)                      
    - HeapInuse     376KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   527    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   192KB  (used by stack allocator right now)                   
    - StackSys     192KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     268KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       214MB  (allocated and still in use)                  
      - TotalAlloc  356MB  (allocated (even if freed))                   
      - Sys         309MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     76219  (number of mallocs)                           
      - Frees       34475  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     214MB  (allocated and still in use)         
      - HeapSys       288MB  (obtained from system)               
      - HeapIdle      73MB   (in idle spans)                      
      - HeapInuse     214MB  (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   41744  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   192KB  (used by stack allocator right now)                   
      - StackSys     192KB  (obtained from system)                                
      - MSpanInuse   225KB  (mspan structures / in use now)                       
      - MSpanSys     240KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        18MB   (GC metadata / obtained form system)                  
      - OtherSys     1MB    (other system allocations)                            

[garbage collections]
  0. GC - 62.225159ms
  1. GC - 56.665619ms
  2. GC - 89.450751ms
  3. GC - 65.649404ms
  4. GC - 57.154845ms
  5. GC - 58.463406ms
  6. GC - 56.052632ms
  7. GC - 56.512408ms
  8. GC - 57.36048ms
  9. GC - 75.21313ms

[after GC] 1000000-collections:
      General statistics
        - Alloc       128MB  (allocated and still in use)                  
        - TotalAlloc  356MB  (allocated (even if freed))                   
        - Sys         309MB  (obtained from system (sum of XxxSys below))  
        - Lookups     5      (number of pointer lookups)                   
        - Mallocs     77492  (number of mallocs)                           
        - Frees       72660  (number of frees)                             

      Main allocation heap statistics
        - HeapAlloc     128MB  (allocated and still in use)         
        - HeapSys       288MB  (obtained from system)               
        - HeapIdle      159MB  (in idle spans)                      
        - HeapInuse     128MB  (in non-idle span)                   
        - HeapReleased  0B     (released to the OS)                 
        - HeapObjects   4832   (total number of allocated objects)  

      Low-level fixed-size structure allocator statistics
        - StackInuse   192KB  (used by stack allocator right now)                   
        - StackSys     192KB  (obtained from system)                                
        - MSpanInuse   28KB   (mspan structures / in use now)                       
        - MSpanSys     240KB  (obtained from system)                                
        - MCacheInuse  1KB    (in use now))                                         
        - MCacheSys    16KB   (mcache structures / obtained from system)            
        - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
        - GCSys        18MB   (GC metadata / obtained form system)                  
        - OtherSys     1MB    (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 4362' now       1 31641170518 ns/op
ok      sourcegraph.com/sourcegraph/appdash 31.684s

top

$ top -p 4362

top - 13:55:41 up 2 days, 11:02,  9 users,  load average: 0.24, 0.33, 0.44
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.6 us,  0.8 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7761588 used,   326456 free,   195860 buffers
KiB Swap: 15625212 total,   974988 used, 14650224 free.  1395596 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 4362 stephen   20   0  473.3m 314.9m   5.7m S   2.1  4.0   0:01.98 appdash.test                     
slimsag commented 9 years ago

Test 4 (repeat of test 3)

Test Feature Enabled
10 explicit runtime.GC calls? Yes
10 explicit runtime/debug.FreeOSMemory calls? No
$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       80KB   (allocated and still in use)                  
    - TotalAlloc  862KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15974  (number of mallocs)                           
    - Frees       15446  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     80KB   (allocated and still in use)         
    - HeapSys       864KB  (obtained from system)               
    - HeapIdle      480KB  (in idle spans)                      
    - HeapInuse     384KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   528    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   160KB  (used by stack allocator right now)                   
    - StackSys     160KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     268KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       127MB  (allocated and still in use)                  
      - TotalAlloc  345MB  (allocated (even if freed))                   
      - Sys         205MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     76171  (number of mallocs)                           
      - Frees       55243  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     127MB  (allocated and still in use)         
      - HeapSys       190MB  (obtained from system)               
      - HeapIdle      62MB   (in idle spans)                      
      - HeapInuse     128MB  (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   20928  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   192KB  (used by stack allocator right now)                   
      - StackSys     192KB  (obtained from system)                                
      - MSpanInuse   114KB  (mspan structures / in use now)                       
      - MSpanSys     192KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        12MB   (GC metadata / obtained form system)                  
      - OtherSys     1MB    (other system allocations)                            

[garbage collections]
  0. GC - 51.791745ms
  1. GC - 44.511702ms
  2. GC - 45.125178ms
  3. GC - 46.553299ms
  4. GC - 46.943978ms
  5. GC - 43.940203ms
  6. GC - 44.085764ms
  7. GC - 43.204175ms
  8. GC - 43.587524ms
  9. GC - 43.648594ms

[after GC] 1000000-collections:
      General statistics
        - Alloc       89MB   (allocated and still in use)                  
        - TotalAlloc  345MB  (allocated (even if freed))                   
        - Sys         205MB  (obtained from system (sum of XxxSys below))  
        - Lookups     5      (number of pointer lookups)                   
        - Mallocs     77444  (number of mallocs)                           
        - Frees       67768  (number of frees)                             

      Main allocation heap statistics
        - HeapAlloc     89MB   (allocated and still in use)         
        - HeapSys       190MB  (obtained from system)               
        - HeapIdle      98MB   (in idle spans)                      
        - HeapInuse     92MB   (in non-idle span)                   
        - HeapReleased  0B     (released to the OS)                 
        - HeapObjects   9676   (total number of allocated objects)  

      Low-level fixed-size structure allocator statistics
        - StackInuse   192KB  (used by stack allocator right now)                   
        - StackSys     192KB  (obtained from system)                                
        - MSpanInuse   81KB   (mspan structures / in use now)                       
        - MSpanSys     192KB  (obtained from system)                                
        - MCacheInuse  1KB    (in use now))                                         
        - MCacheSys    16KB   (mcache structures / obtained from system)            
        - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
        - GCSys        12MB   (GC metadata / obtained form system)                  
        - OtherSys     1MB    (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 4420' now       1 31403476693 ns/op
ok      sourcegraph.com/sourcegraph/appdash 31.448s

top

$ top -p 4420

top - 13:58:08 up 2 days, 11:05,  9 users,  load average: 0.22, 0.29, 0.41
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.6 us,  1.5 sy,  0.0 ni, 95.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7637256 used,   450788 free,   196244 buffers
KiB Swap: 15625212 total,   974936 used, 14650276 free.  1395648 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 4420 stephen   20   0  241.0m 210.8m   5.7m S   1.7  2.7   0:01.58 appdash.test                     
slimsag commented 9 years ago

Test 5 (after Appdash PR #80)

Test Feature Enabled
10 explicit runtime.GC calls? Yes
10 explicit runtime/debug.FreeOSMemory calls? Yes
$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       80KB   (allocated and still in use)                  
    - TotalAlloc  860KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15967  (number of mallocs)                           
    - Frees       15443  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     80KB   (allocated and still in use)         
    - HeapSys       880KB  (obtained from system)               
    - HeapIdle      496KB  (in idle spans)                      
    - HeapInuse     384KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   524    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   144KB  (used by stack allocator right now)                   
    - StackSys     144KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     269KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       63MB   (allocated and still in use)                  
      - TotalAlloc  224MB  (allocated (even if freed))                   
      - Sys         179MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     76337  (number of mallocs)                           
      - Frees       37670  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     63MB   (allocated and still in use)         
      - HeapSys       166MB  (obtained from system)               
      - HeapIdle      101MB  (in idle spans)                      
      - HeapInuse     64MB   (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   38667  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   176KB  (used by stack allocator right now)                   
      - StackSys     176KB  (obtained from system)                                
      - MSpanInuse   211KB  (mspan structures / in use now)                       
      - MSpanSys     224KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        10MB   (GC metadata / obtained form system)                  
      - OtherSys     1MB    (other system allocations)                            

[garbage collections]
  0. GC - 27.339702ms
  1. GC - 21.278411ms
  2. GC - 25.189139ms
  3. GC - 1.069467ms
  4. GC - 239.512µs
  5. GC - 203.882µs
  6. GC - 202.362µs
  7. GC - 199.638µs
  8. GC - 200.985µs
  9. GC - 201.5µs

[after GC] 1000000-collections:
      General statistics
        - Alloc       81KB   (allocated and still in use)                  
        - TotalAlloc  224MB  (allocated (even if freed))                   
        - Sys         179MB  (obtained from system (sum of XxxSys below))  
        - Lookups     5      (number of pointer lookups)                   
        - Mallocs     77615  (number of mallocs)                           
        - Frees       77070  (number of frees)                             

      Main allocation heap statistics
        - HeapAlloc     81KB   (allocated and still in use)         
        - HeapSys       166MB  (obtained from system)               
        - HeapIdle      165MB  (in idle spans)                      
        - HeapInuse     376KB  (in non-idle span)                   
        - HeapReleased  0B     (released to the OS)                 
        - HeapObjects   545    (total number of allocated objects)  

      Low-level fixed-size structure allocator statistics
        - StackInuse   176KB  (used by stack allocator right now)                   
        - StackSys     176KB  (obtained from system)                                
        - MSpanInuse   5KB    (mspan structures / in use now)                       
        - MSpanSys     224KB  (obtained from system)                                
        - MCacheInuse  1KB    (in use now))                                         
        - MCacheSys    16KB   (mcache structures / obtained from system)            
        - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
        - GCSys        10MB   (GC metadata / obtained form system)                  
        - OtherSys     1MB    (other system allocations)                            

[free os memory]
  0. FreeOSMemory - 6.295947ms
  1. FreeOSMemory - 300.451µs
  2. FreeOSMemory - 211.189µs
  3. FreeOSMemory - 209.474µs
  4. FreeOSMemory - 325.314µs
  5. FreeOSMemory - 333.704µs
  6. FreeOSMemory - 335.901µs
  7. FreeOSMemory - 323.038µs
  8. FreeOSMemory - 320.607µs
  9. FreeOSMemory - 210.299µs

[after GC & FreeOSMemory] 1000000-collections:
        General statistics
          - Alloc       81KB   (allocated and still in use)                  
          - TotalAlloc  224MB  (allocated (even if freed))                   
          - Sys         179MB  (obtained from system (sum of XxxSys below))  
          - Lookups     5      (number of pointer lookups)                   
          - Mallocs     78927  (number of mallocs)                           
          - Frees       78382  (number of frees)                             

        Main allocation heap statistics
          - HeapAlloc     81KB   (allocated and still in use)         
          - HeapSys       166MB  (obtained from system)               
          - HeapIdle      165MB  (in idle spans)                      
          - HeapInuse     376KB  (in non-idle span)                   
          - HeapReleased  165MB  (released to the OS)                 
          - HeapObjects   545    (total number of allocated objects)  

        Low-level fixed-size structure allocator statistics
          - StackInuse   176KB  (used by stack allocator right now)                   
          - StackSys     176KB  (obtained from system)                                
          - MSpanInuse   5KB    (mspan structures / in use now)                       
          - MSpanSys     224KB  (obtained from system)                                
          - MCacheInuse  1KB    (in use now))                                         
          - MCacheSys    16KB   (mcache structures / obtained from system)            
          - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
          - GCSys        10MB   (GC metadata / obtained form system)                  
          - OtherSys     1MB    (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 8582' now       1 30877967166 ns/op
ok      sourcegraph.com/sourcegraph/appdash 30.912s

top

$ top -p 8582

top - 16:42:49 up 2 days, 13:50,  9 users,  load average: 0.33, 0.47, 0.48
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.4 us,  2.0 sy,  0.0 ni, 95.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7421972 used,   666072 free,   143948 buffers
KiB Swap: 15625212 total,  1011504 used, 14613708 free.  1087780 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
 8582 stephen   20   0  215.4m  19.7m   5.7m S   9.6  0.2   0:02.07 appdash.test                     
slimsag commented 9 years ago

Confirmed as fixed in Go 1.5 (ensure you properly rm -rf $GOPATH/pkg when switching for testing purposes!), make note of the very low RES usage:

Index / Matrix

(none of these called GC or FreeOSMemory manually)

Go Version top VIRT top RES HeapObjects (pre, post) TotalAlloc
Go 1.5 133.1m 13.1m (721, 1372) 196MB
Go 1.4.2 195.9m 165.2m (525, 26204) 176MB

With Go 1.5 @ master

$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil-4 

[pre] 1000000-collections:
  General statistics
    - Alloc       157KB  (allocated and still in use)                  
    - TotalAlloc  289KB  (allocated (even if freed))                   
    - Sys         3MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     2328   (number of mallocs)                           
    - Frees       1607   (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     157KB  (allocated and still in use)         
    - HeapSys       672KB  (obtained from system)               
    - HeapIdle      224KB  (in idle spans)                      
    - HeapInuse     448KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   721    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   352KB  (used by stack allocator right now)                   
    - StackSys     352KB  (obtained from system)                                
    - MSpanInuse   6KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  4KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        34KB   (GC metadata / obtained form system)                  
    - OtherSys     1MB    (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       2MB    (allocated and still in use)                  
      - TotalAlloc  196MB  (allocated (even if freed))                   
      - Sys         8MB    (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     64928  (number of mallocs)                           
      - Frees       63556  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     2MB   (allocated and still in use)         
      - HeapSys       5MB   (obtained from system)               
      - HeapIdle      2MB   (in idle spans)                      
      - HeapInuse     2MB   (in non-idle span)                   
      - HeapReleased  0B    (released to the OS)                 
      - HeapObjects   1372  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   416KB  (used by stack allocator right now)                   
      - StackSys     416KB  (obtained from system)                                
      - MSpanInuse   12KB   (mspan structures / in use now)                       
      - MSpanSys     32KB   (obtained from system)                                
      - MCacheInuse  4KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        195KB  (GC metadata / obtained form system)                  
      - OtherSys     981KB  (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 13006' now       1    30528844025 ns/op
ok      sourcegraph.com/sourcegraph/appdash 30.557s

top

$ top -p 13006

top - 18:38:41 up 2 days, 15:45,  6 users,  load average: 0.20, 0.57, 0.66
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.9 us,  2.1 sy,  0.0 ni, 95.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7318096 used,   769948 free,   164984 buffers
KiB Swap: 15625212 total,  1023112 used, 14602100 free.  1254680 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
13006 stephen   20   0  133.1m  13.1m   6.0m S  12.2  0.2   0:01.44 appdash.test                     

With 1.4.2 (linux/amd64 binary release)

$ go test -tags=danger -memprofilerate=1 -memprofile=mem.out -bench=BenchmarkChunkedCollector1mil -run=NONE -v
PASS
BenchmarkChunkedCollector1mil   

[pre] 1000000-collections:
  General statistics
    - Alloc       79KB   (allocated and still in use)                  
    - TotalAlloc  862KB  (allocated (even if freed))                   
    - Sys         2MB    (obtained from system (sum of XxxSys below))  
    - Lookups     5      (number of pointer lookups)                   
    - Mallocs     15972  (number of mallocs)                           
    - Frees       15447  (number of frees)                             

  Main allocation heap statistics
    - HeapAlloc     79KB   (allocated and still in use)         
    - HeapSys       864KB  (obtained from system)               
    - HeapIdle      488KB  (in idle spans)                      
    - HeapInuse     376KB  (in non-idle span)                   
    - HeapReleased  0B     (released to the OS)                 
    - HeapObjects   525    (total number of allocated objects)  

  Low-level fixed-size structure allocator statistics
    - StackInuse   160KB  (used by stack allocator right now)                   
    - StackSys     160KB  (obtained from system)                                
    - MSpanInuse   4KB    (mspan structures / in use now)                       
    - MSpanSys     16KB   (obtained from system)                                
    - MCacheInuse  1KB    (in use now))                                         
    - MCacheSys    16KB   (mcache structures / obtained from system)            
    - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
    - GCSys        70KB   (GC metadata / obtained form system)                  
    - OtherSys     268KB  (other system allocations)                            

[post] 1000000-collections:
    General statistics
      - Alloc       85MB   (allocated and still in use)                  
      - TotalAlloc  176MB  (allocated (even if freed))                   
      - Sys         159MB  (obtained from system (sum of XxxSys below))  
      - Lookups     5      (number of pointer lookups)                   
      - Mallocs     73374  (number of mallocs)                           
      - Frees       47170  (number of frees)                             

    Main allocation heap statistics
      - HeapAlloc     85MB   (allocated and still in use)         
      - HeapSys       147MB  (obtained from system)               
      - HeapIdle      61MB   (in idle spans)                      
      - HeapInuse     85MB   (in non-idle span)                   
      - HeapReleased  0B     (released to the OS)                 
      - HeapObjects   26204  (total number of allocated objects)  

    Low-level fixed-size structure allocator statistics
      - StackInuse   192KB  (used by stack allocator right now)                   
      - StackSys     192KB  (obtained from system)                                
      - MSpanInuse   142KB  (mspan structures / in use now)                       
      - MSpanSys     144KB  (obtained from system)                                
      - MCacheInuse  1KB    (in use now))                                         
      - MCacheSys    16KB   (mcache structures / obtained from system)            
      - BuckHashSys  1MB    (profiling bucket hash table / obtained from system)  
      - GCSys        9MB    (GC metadata / obtained form system)                  
      - OtherSys     900KB  (other system allocations)                            

sleeping 30s; check memory usage with 'top -p 14351' now       1    30844105328 ns/op
ok      sourcegraph.com/sourcegraph/appdash 30.885s

top

$ top -p 14351

top - 18:48:30 up 2 days, 15:55,  6 users,  load average: 0.74, 0.67, 0.68
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.5 us,  1.9 sy,  0.0 ni, 95.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8088044 total,  7484804 used,   603240 free,   167964 buffers
KiB Swap: 15625212 total,  1022664 used, 14602548 free.  1314516 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
14351 stephen   20   0  195.9m 165.2m   5.6m S   9.0  2.1   0:01.72 appdash.test