read 110MB datas from mongodb to memory, but globalsign/mgo package use 1.2GB memory

orange-jacky commented 6 years ago

Hi, I use the globalsign/mgo package in my project, and I want to read 110MB of data (200,000 records) from MongoDB into memory. After profiling with runtime/pprof, I found that the map holding the data is about 110MB, as expected, but iterating over the results uses 600MB+: "github.com/globalsign/mgo/bson.(*decoder).readElemTo" accounts for about 600MB. Can you tell me how to optimize this? I look forward to your reply.

macbookpro:fpr_index fredlee$ go tool pprof fpr_index logs_profile/mem-profile-2018-07-20_12-39-48.prof
File: fpr_index
Type: inuse_space
Time: Jul 20, 2018 at 12:39pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top5
Showing nodes accounting for 666.81MB, 91.82% of 726.19MB total
Dropped 26 nodes (cum <= 3.63MB)
Showing top 5 nodes out of 25
      flat  flat%   sum%        cum   cum%
  298.16MB 41.06% 41.06%   298.16MB 41.06%  reflect.mapassign
     118MB 16.25% 57.31%      118MB 16.25%  github.com/globalsign/mgo/bson.(*decoder).readStr
  111.14MB 15.30% 72.61%   713.31MB 98.23%  huoli/fpr_index/util.(*Orderinfos).LoadFromDb
      96MB 13.22% 85.83%   602.30MB 82.94%  github.com/globalsign/mgo/bson.(*decoder).readElemTo
   43.50MB  5.99% 91.82%    43.50MB  5.99%  github.com/globalsign/mgo/bson.(*decoder).readCStr
(pprof) list LoadFromDb
Total: 726.19MB
ROUTINE ======================== huoli/fpr_index/util.(*Orderinfos).LoadFromDb in /Users/fredlee/Documents/develop/go/workspace/src/huoli/fpr_index/util/mongo_orderinfo.go
  111.14MB   713.31MB (flat, cum) 98.23% of Total
         .          .     54:    mongo.C()
         .          .     55:
         .          .     56:    _map := make(map[string]Orderinfo)
         .          .     57:    var orderinfo Orderinfo
         .          .     58:    iter := mongo.Session.C.Find(nil).Iter()
         .   602.17MB     59:    for iter.Next(&orderinfo) {
         .          .     60:        item := orderinfo
  111.14MB   111.14MB     61:        _map[item.Id] = item
         .          .     62:    }
         .          .     63:    if err := iter.Close(); err != nil {
         .          .     64:        return num, err
         .          .     65:    }
         .          .     66:    num = len(_map)
(pprof) list readElemTo
Total: 726.19MB
ROUTINE ======================== github.com/globalsign/mgo/bson.(*decoder).readElemTo in /Users/fredlee/Documents/develop/go/workspace/src/github.com/globalsign/mgo/bson/decode.go
      96MB     1.18GB (flat, cum) 166.15% of Total
         .          .    573:}
         .          .    574:
         .          .    575:// Attempt to decode an element from the document and put it into out.
         .          .    576:// If the types are not compatible, the returned ok value will be
         .          .    577:// false and out will be unchanged.
   22.50MB    22.50MB    578:func (d *decoder) readElemTo(out reflect.Value, kind byte) (good bool) {
         .          .    579:    outt := out.Type()
         .          .    580:
         .          .    581:    if outt == typeRaw {
         .          .    582:        out.Set(reflect.ValueOf(d.readRaw(kind)))
         .          .    583:        return true
         .          .    584:    }
         .          .    585:
         .          .    586:    if outt == typeRawPtr {
         .          .    587:        raw := d.readRaw(kind)
         .          .    588:        out.Set(reflect.ValueOf(&raw))
         .          .    589:        return true
         .          .    590:    }
         .          .    591:
         .          .    592:    if kind == ElementDocument {
         .          .    593:        // Delegate unmarshaling of documents.
         .          .    594:        outt := out.Type()
         .          .    595:        outk := out.Kind()
         .          .    596:        switch outk {
         .          .    597:        case reflect.Interface, reflect.Ptr, reflect.Struct, reflect.Map:
         .   482.79MB    598:            d.readDocTo(out)
         .          .    599:            return true
         .          .    600:        }
         .          .    601:        if setterStyle(outt) != setterNone {
         .          .    602:            d.readDocTo(out)
         .          .    603:            return true
         .          .    604:        }
         .          .    605:        if outk == reflect.Slice {
         .          .    606:            switch outt.Elem() {
         .          .    607:            case typeDocElem:
         .          .    608:                out.Set(d.readDocElems(outt))
         .          .    609:            case typeRawDocElem:
         .          .    610:                out.Set(d.readRawDocElems(outt))
         .          .    611:            default:
         .          .    612:                d.dropElem(kind)
         .          .    613:            }
         .          .    614:            return true
         .          .    615:        }
         .          .    616:        d.dropElem(kind)
         .          .    617:        return true
         .          .    618:    }
         .          .    619:
         .          .    620:    if setter := getSetter(outt, out); setter != nil {
         .          .    621:        err := setter.SetBSON(d.readRaw(kind))
         .          .    622:        if err == ErrSetZero {
         .          .    623:            out.Set(reflect.Zero(outt))
         .          .    624:            return true
         .          .    625:        }
         .          .    626:        if err == nil {
         .          .    627:            return true
         .          .    628:        }
         .          .    629:        if _, ok := err.(TypeError); !ok {
         .          .    630:            panic(err)
         .          .    631:        }
         .          .    632:        return false
         .          .    633:    }
         .          .    634:
         .          .    635:    var in interface{}
         .          .    636:
         .          .    637:    switch kind {
         .          .    638:    case ElementFloat64:
   14.50MB    14.50MB    639:        in = d.readFloat64()
         .          .    640:    case ElementString:
      59MB   177.01MB    641:        in = d.readStr()
         .          .    642:    case ElementDocument:
         .          .    643:        panic("Can't happen. Handled above.")
         .          .    644:    case ElementArray:
         .          .    645:        outt := out.Type()
         .          .    646:        if setterStyle(outt) != setterNone {
         .          .    647:            // Skip the value so its data is handed to the setter below.
         .          .    648:            d.dropElem(kind)
         .          .    649:            break
         .          .    650:        }
         .          .    651:        for outt.Kind() == reflect.Ptr {
         .          .    652:            outt = outt.Elem()
         .          .    653:        }
         .          .    654:        switch outt.Kind() {
         .          .    655:        case reflect.Array:
         .          .    656:            d.readArrayDocTo(out)
         .          .    657:            return true
         .          .    658:        case reflect.Slice:
         .   641.34kB    659:            in = d.readSliceDoc(outt)
         .          .    660:        default:
         .   509.17MB    661:            in = d.readSliceDoc(typeSlice)
         .          .    662:        }
         .          .    663:    case ElementBinary:
         .          .    664:        b := d.readBinary()
         .          .    665:        if b.Kind == BinaryGeneric || b.Kind == BinaryBinaryOld {
         .          .    666:            in = b.Data
(pprof)
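
One possible reading of this profile, offered as a hypothesis rather than a diagnosis: the large flat cost in reflect.mapassign, together with the cumulative cost on the fallback `readSliceDoc(typeSlice)` line, suggests that parts of each record are decoded into generic containers (maps and interface slices) instead of typed fields. The definition of Orderinfo is not shown in the thread, so this is an assumption. The sketch below uses only the standard library, not mgo, and every name in it (`typedRecord`, `buildTyped`, `buildGeneric`, `retained`) is hypothetical. It only illustrates the general point that the same data tends to occupy more heap when held in generic containers than in a typed struct.

```go
package main

import (
	"fmt"
	"runtime"
)

// typedRecord is a hypothetical record with concrete field types.
type typedRecord struct {
	ID    string
	Price float64
	Tags  []string
}

// buildTyped holds n records as structs keyed by ID.
func buildTyped(n int) map[string]typedRecord {
	m := make(map[string]typedRecord, n)
	for i := 0; i < n; i++ {
		id := fmt.Sprintf("order-%d", i)
		m[id] = typedRecord{ID: id, Price: float64(i) + 0.5, Tags: []string{"a", "b", "c"}}
	}
	return m
}

// buildGeneric holds the same data in generic containers: one map per
// record, with values boxed in interface{} and the tags in []interface{}.
func buildGeneric(n int) map[string]map[string]interface{} {
	m := make(map[string]map[string]interface{}, n)
	for i := 0; i < n; i++ {
		id := fmt.Sprintf("order-%d", i)
		m[id] = map[string]interface{}{
			"id":    id,
			"price": float64(i) + 0.5,
			"tags":  []interface{}{"a", "b", "c"},
		}
	}
	return m
}

// retained reports how much the live heap grew, in bytes, after build
// returned and a garbage collection ran while its result was still reachable.
func retained(build func() interface{}) uint64 {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	v := build()
	runtime.GC()
	runtime.ReadMemStats(&after)
	runtime.KeepAlive(v) // keep the result alive across the second measurement
	if after.HeapAlloc < before.HeapAlloc {
		return 0
	}
	return after.HeapAlloc - before.HeapAlloc
}

func main() {
	const n = 200000 // same record count as in the report above
	typed := retained(func() interface{} { return buildTyped(n) })
	generic := retained(func() interface{} { return buildGeneric(n) })
	fmt.Printf("typed structs:      %d MB retained\n", typed>>20)
	fmt.Printf("generic containers: %d MB retained\n", generic>>20)
}
```

The exact numbers depend on the Go version and platform, so none are quoted here. If Orderinfo does contain map or interface{} fields, replacing them with concrete struct and slice types may be worth trying.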

domodwyer commented 6 years ago

Hi @orange-jacky

This is most likely an application issue, not mgo - please post a reproducible example if you think it's definitely mgo.

Either way, readElemTo is the method that allocates the types to store the document fields - it's always going to allocate extensively - it has to do so both for your documents, and the mongodb protocol messages / any communication with mongo. I guess you're using pprof in alloc_space mode which would show this.
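
To make the distinction between the two pprof modes concrete, here is a minimal sketch using only the standard library. The helpers `churn` and `measure` are hypothetical names for this illustration and are unrelated to mgo. `runtime.MemStats.TotalAlloc` is cumulative, roughly what alloc_space reports, while `HeapAlloc` after a garbage collection approximates what is still live, roughly what inuse_space reports. The `Type:` line in the pprof header shows which mode a given session is using.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the most recent buffer reachable so the allocations in churn
// cannot be optimized away.
var sink []byte

// churn allocates n buffers of size bytes each. Only the last one stays
// reachable; the others become garbage, much like temporary decode buffers.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
}

// measure runs fn and returns the bytes allocated while it ran (cumulative)
// and the change in live heap after a garbage collection.
func measure(fn func()) (allocated uint64, retained int64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	fn()
	runtime.GC()
	runtime.ReadMemStats(&after)
	allocated = after.TotalAlloc - before.TotalAlloc
	retained = int64(after.HeapAlloc) - int64(before.HeapAlloc)
	return allocated, retained
}

func main() {
	allocated, retained := measure(func() { churn(1000, 1<<20) })
	fmt.Printf("allocated in total:  %d MB\n", allocated>>20)
	fmt.Printf("still live after GC: %d MB\n", retained>>20)
}
```

A decoder that allocates heavily will show a large cumulative figure even when little of that memory is retained, so the two figures answer different questions.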

Dom

orange-jacky commented 6 years ago

Hi domodwyer, I have posted my project and a sample record so that you can reproduce the issue.

Environment:
1. Go 1.10.3
2. Linux 2.6.32-642.13.1.el6.x86_64
3. MongoDB 3.2

Steps to reproduce:
1. Download fpr_index.zip, unzip it into $GOPATH/huoli, and compile it with go install huoli/fpr_index.
2. Download record.txt, which contains a single record, and use it to generate 200,000 records.
3. Edit fpr_index/conf/cf.json and change the mongo host info.
4. Run ps -ef | grep fpr to get the fpr pid, then run top -p pid. You will see that fpr_index uses a lot of memory.
5. To turn on CPU and memory profiling, cd $GOPATH/huoli/fpr_index and run echo "on" > conf/profile_switch.txt. After a while, turn profiling off with echo "off" > conf/profile_switch.txt. The memory and CPU profile files will then be in logs_profile.
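
For readers without the zip at hand, a heap profile like the ones in logs_profile can be produced with the standard runtime/pprof package. The sketch below is not the code from fpr_index. The helper name `heapProfile` and the output file name are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"runtime/pprof"
)

// heapProfile returns a heap profile of the current process.
// (Hypothetical helper; not taken from fpr_index.)
func heapProfile() ([]byte, error) {
	runtime.GC() // collect first so the in-use figures reflect live objects
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		fmt.Fprintln(os.Stderr, "heap profile:", err)
		os.Exit(1)
	}
	// The file name is an assumption; any writable path works.
	path := filepath.Join(os.TempDir(), "mem-profile-example.prof")
	if err := os.WriteFile(path, data, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write profile:", err)
		os.Exit(1)
	}
	fmt.Println("heap profile written to", path)
}
```

The same file can be opened with go tool pprof -sample_index=inuse_space to see what is still held, or with -sample_index=alloc_space to see everything allocated since the program started.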

globalsign / mgo

read 110MB datas from mongodb to memory, but globalsign/mgo package use 1.2GB memory #221