Tokutek / mongo

TokuMX is a high-performance, concurrent, compressing, drop-in replacement engine for MongoDB | Issue tracker: https://tokutek.atlassian.net/browse/MX/ |
http://www.tokutek.com/products/tokumx-for-mongodb/

Make mongodump multithreaded #1207

Closed malexejev closed 9 years ago

malexejev commented 9 years ago

Typical top stats while mongodump is running:

top - 16:18:59 up 26 min,  1 user,  load average: 2.56, 1.60, 0.84
Tasks: 153 total,   4 running, 149 sleeping,   0 stopped,   0 zombie
Cpu0  : 69.0%us,  5.4%sy,  0.0%ni, 22.6%id,  2.7%wa,  0.0%hi,  0.0%si,  0.3%st
Cpu1  :  7.7%us,  0.3%sy,  0.0%ni, 91.6%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  4.6%us,  0.7%sy,  0.0%ni, 94.4%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 16.3%us,  1.0%sy,  0.0%ni, 82.4%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  6.0%us,  0.3%sy,  0.0%ni, 93.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  4.3%us,  0.0%sy,  0.0%ni, 95.3%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.7%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  4.0%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 : 91.7%us,  8.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 : 84.4%us,  8.9%sy,  0.0%ni,  6.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  62163604k total, 61878156k used,   285448k free,   168708k buffers
Swap:        0k total,        0k used,        0k free, 28322392k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1770 mongo     20   0 36.1g  30g 5336 S  102 52.3  12:20.08 mongodump
 1285 ubuntu    20   0 19312 7256  816 R  100  0.0   1:12.33 rsync
 1366 ubuntu    20   0 19136 6996  816 R  100  0.0   1:23.55 rsync
 1447 ubuntu    20   0 14236 1976  816 R    6  0.0   0:05.97 rsync

Typical disk stats while mongodump is running:

ubuntu@ip-10-36-68-72:~$ iostat -xmt
Linux 3.2.0-40-virtual (ip-10-36-68-72)     10/24/2014  _x86_64_    (16 CPU)

10/24/2014 04:20:04 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.42    0.00    0.80    0.43    1.40   93.95

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdap1            0.30     0.73   13.14    0.51     0.08     0.01    12.74     0.02    1.45    1.11    9.98   0.12   0.17
xvdb              0.10     0.70  489.85    6.52    19.87     0.26    83.04     0.76    1.53    1.45    7.22   0.10   4.94
xvdc              0.11     0.63  489.82    6.44    19.87     0.26    83.06     0.76    1.52    1.44    7.92   0.10   4.98
xvdl              4.32     0.02    3.09    0.01     0.03     0.00    19.19     0.00    0.76    0.76    2.22   0.49   0.15
md127             0.00     0.00  979.83   14.30    39.73     0.52    82.92     0.00    0.00    0.00    0.00   0.00   0.00
xvdf              0.22     4.92    2.89  116.30     0.01     4.84    83.34     2.51   21.09    0.52   21.61   0.29   3.49
md7               0.00     0.00    8.95  363.44     0.03    14.52    80.02     0.00    0.00    0.00    0.00   0.00   0.00
xvdg              0.08     4.85    2.87  116.28     0.01     4.84    83.36     3.11   26.10    0.65   26.72   0.31   3.70
xvdh              0.16     4.79    2.80  116.30     0.01     4.84    83.40     3.81   32.02    0.65   32.78   0.33   3.93

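So the obvious bottleneck is that mongodump uses only a single CPU core: it sits at about 100% CPU in top, while the disks it reads from (xvdb, xvdc) show under 5% utilization in iostat. Making mongodump multithreaded could potentially speed up the backup process by 10x or more.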
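
Until something like that exists in mongodump itself, one possible workaround is to run several mongodump processes side by side, one per collection. The following is only a rough sketch under assumptions: the helper names `dump_command` and `dump_in_parallel`, the worker count, and the database and collection names are hypothetical, and it relies only on mongodump's `--db`, `--collection`, and `--out` options. It helps only when the data is spread over several collections, and the separate dumps are not a single consistent snapshot.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_command(db, collection, out_dir):
    """Build the mongodump invocation for a single collection."""
    return ["mongodump", "--db", db, "--collection", collection, "--out", out_dir]


def dump_in_parallel(db, collections, out_dir, workers=4, run=subprocess.call):
    """Run one mongodump process per collection, `workers` at a time.

    `collections` is a list of collection names (hypothetical input).
    `run` is injectable so the function can be exercised without mongodump.
    Returns a dict mapping collection name to the process exit code.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread just waits on its own child process, so the
        # CPU-heavy work happens in separate mongodump processes.
        codes = pool.map(lambda c: run(dump_command(db, c, out_dir)), collections)
        return dict(zip(collections, codes))
```

Usage would look something like `dump_in_parallel("mydb", ["users", "events"], "/backup/dump", workers=8)`, followed by a check that every exit code in the returned dict is 0.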

esmet commented 9 years ago

Hi, we've migrated issue trackers to jira:

https://tokutek.atlassian.net/secure/Dashboard.jspa

malexejev commented 9 years ago

Ah, I don't want to register in yet another Jira instance. Is it possible to set up issue migration from GH to JiraOnDemand? Otherwise, it would be better to just close the issues section in this repo.
