src-d / borges

borges collects and stores Git repositories.
https://docs.sourced.tech/borges/
GNU General Public License v3.0

Profile Borges node to discover bottlenecks #156

Closed ajnavarro closed 7 years ago

ajnavarro commented 7 years ago

Plan:

bzz commented 7 years ago

After local profiling, there are 2 main slow points:

To reproduce

   docker run --name some-postgres -e POSTGRES_PASSWORD=testing -p 5432:5432 -e POSTGRES_USER=testing -d postgres
   docker run -d --hostname rabbit --name rabbit -p 8080:15672 -p 5672:5672 rabbitmq:3-management
   curl -s -u guest:guest "http://localhost:8080/api/queues/%2F/borges" | jq .messages
   curl -s -u guest:guest "http://localhost:8080/api/queues/%2F/borges.buriedQueue" | jq .messages
   docker exec -ti some-postgres psql -U testing
   # select count(status) as num, status from repositories group by status;
   make dependencies
   make packages

   cat repos2000.txt | perl -MList::Util=shuffle -e 'print shuffle(<>);' | head -40 > repos40.txt

   ./build/borges_darwin_amd64/borges init
   ./build/borges_darwin_amd64/borges producer --source=file --file ~/src-d/pipeline/repos40.txt
   ./build/borges_darwin_amd64/borges consumer --workers=4 --loglevel=debug --profiler
   go tool pprof $GOPATH/src/github.com/src-d/borges/build/borges_darwin_amd64/borges http://localhost:6060/debug/pprof/profile

erizocosmico commented 7 years ago

Fix for the first bottleneck: https://github.com/src-d/borges/pull/157

erizocosmico commented 7 years ago

Fix for the i/o bottleneck: https://github.com/src-d/go-git/pull/575

UPDATE: nvm, fix does not work

bzz commented 7 years ago

For remote profiling, the following two k8s descriptors were used:

kubectl create -f borges-consumer-profiler-service.yml
kubectl apply -f borges-consumer.yaml
kubectl describe svc borges-consumer-profiler
**borges-consumer.yaml**

```yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: borges-consumer
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: borges-consumer
    spec:
      nodeSelector:
        srcd.host/type: worker
      containers:
        - name: borges
          image: quay.io/srcd/borges:v0.6.3
          ports:
            - containerPort: 6061
          command: ['borges','consumer','--workers=12','--loglevel=debug', '--timeout=2h', '--profiler' ]
          env:
            - name: CONFIG_BROKER
              value: 'amqp://guest:guest@rabbitmq:5672'
            - name: CONFIG_DBUSER
              value: testing
            - name: CONFIG_DBPASS
              value: testing
            - name: CONFIG_DBHOST
              value: postgres
            - name: CONFIG_DBNAME
              value: testing
            - name: CONFIG_TEMP_DIR
              value: /borges/tmp
            - name: CONFIG_ROOT_REPOSITORIES_DIR
              value: /borges/root-repositories
            - name: CONFIG_LOCKING
              value: 'etcd:http://etcd:2379'
            - name: HADOOP_USER_NAME
              value: root
            - name: CONFIG_HDFS
              value: 'hdfs-namenode:8020'
      imagePullSecrets:
        - name: quay
```

**borges-consumer-profiler-service.yml**

```yml
apiVersion: v1
kind: Service
metadata:
  name: borges-consumer-profiler
  labels:
    app: borges-consumer
spec:
  type: ClusterIP
  ports:
    - port: 6061
      targetPort: 6061
      protocol: TCP
      name: borges-consumer-profiler
  selector:
    app: borges-consumer
```

60s profile from the staging Cluster

cluster-profile001-60s

To reproduce these results locally, build Borges and use pprof.borges.10.3.0.186.6061.samples.cpu.002.pb.gz

$ wget https://github.com/src-d/borges/files/1274451/pprof.borges.10.3.0.186.6061.samples.cpu.002.pb.gz
$ go tool pprof $GOPATH/src/github.com/src-d/borges/build/borges_linux_amd64/borges pprof.borges.10.3.0.186.6061.samples.cpu.002.pb.gz
(pprof) png

bzz commented 7 years ago

Latest 30 min profile of the consumer

(pprof) top
145.04mins of 284.29mins total (51.02%)
Dropped 1486 nodes (cum <= 1.42mins)
Showing top 10 nodes out of 202 (cum >= 7.66mins)
      flat  flat%   sum%        cum   cum%
 34.99mins 12.31% 12.31%  39.05mins 13.74%  runtime.evacuate
 21.37mins  7.52% 19.82%  21.39mins  7.52%  github.com/src-d/borges/vendor/github.com/colinmarc/hdfs/vendor/github.com/golang/protobuf/proto.(*TextMarshaler).writeStruct
 19.50mins  6.86% 26.68%  19.50mins  6.86%  runtime.heapBitsSetTypeGCProg
 13.56mins  4.77% 31.45%  13.56mins  4.77%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/format/config.(*Encoder).encodeSection
 12.51mins  4.40% 35.85%  12.75mins  4.48%  runtime.(*mspan).sweep
  9.41mins  3.31% 39.16%  58.60mins 20.61%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/storage/memory.(*ConfigStorage).Config
  9.08mins  3.19% 42.35%  41.39mins 14.56%  runtime.sweepone
  8.58mins  3.02% 45.37%      9mins  3.16%  time.Time.AppendFormat
  8.40mins  2.96% 48.33%   8.40mins  2.96%  runtime.memmove
  7.66mins  2.69% 51.02%   7.66mins  2.69%  math/big.nat.scan

(pprof) top -cum
10.64s of 17057.39s total (0.062%)
Dropped 1486 nodes (cum <= 85.29s)
Showing top 10 nodes out of 202 (cum >= 5284.35s)
      flat  flat%   sum%        cum   cum%
     0.78s 0.0046% 0.0046%   7561.14s 44.33%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.encodeRefs
     0.81s 0.0047% 0.0093%   7396.36s 43.36%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.(*ReportStatus).Encode
     0.64s 0.0038% 0.013%   7021.69s 41.17%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.(*ReportStatus).Decode
     3.67s 0.022% 0.035%   5879.49s 34.47%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.(*ReportStatus).decodeCommandStatus
     0.15s 0.00088% 0.035%   5423.57s 31.80%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.formatCaps
     3.30s 0.019% 0.055%   5408.41s 31.71%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.(*ulReqEncoder).encodeDepth
         0     0% 0.055%   5406.24s 31.69%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.encodeFirstLine
     0.16s 0.00094% 0.056%   5349.13s 31.36%  compress/flate.NewReader
     1.13s 0.0066% 0.062%   5305.54s 31.10%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.NewReferenceUpdateRequestFromCapabilities
         0     0% 0.062%   5284.35s 30.98%  github.com/src-d/borges/vendor/gopkg.in/src-d/go-git.v4/plumbing/protocol/packp.(*UploadRequest).validateConflictCapabilities

borges-consumer-profile-30m

borges-consumer-profile-30m.svg.zip

erizocosmico commented 7 years ago

Fix for the push bottlenecks: https://github.com/src-d/go-git/pull/582