jmoiron / sqlx

general purpose extensions to golang's database/sql
http://jmoiron.github.io/sqlx/
MIT License

rows.MapScan holding onto too much memory #192

Open omerkirk opened 8 years ago

omerkirk commented 8 years ago

Hi Jason,

As you can see from my previous issues, we are using your library extensively in a new project. We were seeing a steady memory increase in htop; the memory grows until the app crashes about a day later. When we profiled it with pprof, we saw an unusual amount of memory attributed to the MapScan function in sqlx alone. We defer-close both the statements and the rows returned by the library, so I really don't understand where this memory is held or why it isn't released back to the OS. Hopefully the following output means more to you than it does to me.

(pprof) top10
6853.45MB of 6921.01MB total (99.02%)
Dropped 322 nodes (cum <= 34.61MB)
Showing top 10 nodes out of 79 (cum >= 49.57MB)
      flat  flat%   sum%        cum   cum%
 5346.59MB 77.25% 77.25%  5348.09MB 77.27%  github.com/jmoiron/sqlx.MapScan
  630.01MB  9.10% 86.35%   630.01MB  9.10%  gopkg.in/goracle.v1/oracle.numberVarGetValue
  368.51MB  5.32% 91.68%   368.51MB  5.32%  gopkg.in/goracle.v1/oracle.stringVarGetValue
  161.80MB  2.34% 94.02%   161.80MB  2.34%  bytes.makeSlice
   90.21MB  1.30% 95.32%  5848.20MB 84.50%  sportsapi/repository.(*MatchRepo).queryMatches
   87.50MB  1.26% 96.58%  1146.02MB 16.56%  gopkg.in/goracle.v1/oracle.(*Variable).getSingleValueInto
   60.32MB  0.87% 97.46%    61.32MB  0.89%  gopkg.in/goracle.v1/oracle.(*Variable).allocateData
      60MB  0.87% 98.32%       60MB  0.87%  gopkg.in/goracle.v1/oracle.dateTimeVarGetValue
      39MB  0.56% 98.89%   823.87MB 11.90%  sportsapi/repository.(*MatchRepo).queryList
    9.51MB  0.14% 99.02%    49.57MB  0.72%  sportsapi/repository.(*MatchRepo).queryMatch
(pprof) list MapScan
Total: 6.76GB
ROUTINE ======================== github.com/jmoiron/sqlx.(*Row).MapScan in /home/kokteyl/go/src/github.com/jmoiron/sqlx/sqlx.go
         0     9.51MB (flat, cum)  0.14% of Total
         .          .    676:   return SliceScan(r)
         .          .    677:}
         .          .    678:
         .          .    679:// MapScan using this Rows.
         .          .    680:func (r *Row) MapScan(dest map[string]interface{}) error {
         .     9.51MB    681:   return MapScan(r, dest)
         .          .    682:}
         .          .    683:
         .          .    684:func (r *Row) scanAny(dest interface{}, structOnly bool) error {
         .          .    685:   if r.err != nil {
         .          .    686:           return r.err
ROUTINE ======================== github.com/jmoiron/sqlx.(*Rows).MapScan in /home/kokteyl/go/src/github.com/jmoiron/sqlx/sqlx.go
         0     5.21GB (flat, cum) 77.14% of Total
         .          .    541:   return SliceScan(r)
         .          .    542:}
         .          .    543:
         .          .    544:// MapScan using this Rows.
         .          .    545:func (r *Rows) MapScan(dest map[string]interface{}) error {
         .     5.21GB    546:   return MapScan(r, dest)
         .          .    547:}
         .          .    548:
         .          .    549:// StructScan is like sql.Rows.Scan, but scans a single Row into a single Struct.
         .          .    550:// Use this and iterate over Rows manually when the memory load of Select() might be
         .          .    551:// prohibitive.  *Rows.StructScan caches the reflect work of matching up column
ROUTINE ======================== github.com/jmoiron/sqlx.MapScan in /home/kokteyl/go/src/github.com/jmoiron/sqlx/sqlx.go
    5.22GB     5.22GB (flat, cum) 77.27% of Total
         .          .    774:// This will modify the map sent to it in place, so reuse the same map with
         .          .    775:// care.  Columns which occur more than once in the result will overwrite
         .          .    776:// eachother!
         .          .    777:func MapScan(r ColScanner, dest map[string]interface{}) error {
         .          .    778:   // ignore r.started, since we needn't use reflect for anything.
         .          .    779:   columns, err := r.Columns()
         .          .    780:   if err != nil {
         .          .    781:           return err
         .          .    782:   }
         .          .    783:
         .          .    784:   values := make([]interface{}, len(columns))
         .          .    785:   for i := range values {
  512.01kB   512.01kB    786:           values[i] = new(interface{})
         .          .    787:   }
         .          .    788:
         .     1.50MB    789:   err = r.Scan(values...)
         .          .    790:   if err != nil {
         .          .    791:           return err
         .          .    792:   }
         .          .    793:
         .          .    794:   for i, column := range columns {
    5.22GB     5.22GB    795:           dest[column] = *(values[i].(*interface{}))
         .          .    796:   }
         .          .    797:
         .          .    798:   return r.Err()
         .          .    799:}
         .          .    800:
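For reference, a minimal sketch of the access pattern described above (not our actual code: the query text, arguments, and names here are made up, loosely following the function names in the profile). Both the prepared statement and the rows are closed with defer, and each row is scanned into a fresh map with MapScan:

package repository

import "github.com/jmoiron/sqlx"

// queryMatches shows the shape of the code: prepare, query, defer-close both,
// then MapScan each row into its own map.
func queryMatches(db *sqlx.DB, leagueID int64) ([]map[string]interface{}, error) {
	stmt, err := db.Preparex("SELECT * FROM matches WHERE league_id = :1")
	if err != nil {
		return nil, err
	}
	defer stmt.Close()

	rows, err := stmt.Queryx(leagueID)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []map[string]interface{}
	for rows.Next() {
		m := map[string]interface{}{}
		if err := rows.MapScan(m); err != nil {
			return nil, err
		}
		out = append(out, m)
	}
	return out, rows.Err()
}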
heyimalex commented 8 years ago

I think that line with 5.2GB is mostly just internal allocations from dest resizing itself. Doing a similar test with 5 columns, I had zero bytes allocated on that line. Up the column count to 1000 and I'm at 1.7GB. Pre-allocate by passing 1000 as the size when making dest and I'm down to 13MB.
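As a rough sketch of what I mean by pre-allocating (this isn't my exact test code; the helper name is made up, and the size hint just comes from the column count):

package repro

import "github.com/jmoiron/sqlx"

// preSizedMapScan sizes the destination map up front so MapScan's inserts
// don't repeatedly grow it; with very wide result sets that growth is what
// gets attributed to the dest[column] = ... line inside MapScan.
func preSizedMapScan(rows *sqlx.Rows) (map[string]interface{}, error) {
	cols, err := rows.Columns()
	if err != nil {
		return nil, err
	}
	dest := make(map[string]interface{}, len(cols))
	if err := rows.MapScan(dest); err != nil {
		return nil, err
	}
	return dest, nil
}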

(Had a bit here about a possible optimization, but it wouldn't work)

Anyway, I'm not sure that profile gives us the answer; my test code might not have been super accurate, though.