hibari / hibari-brick-rs

A fast, embedded, ordered key-value store for big and small values

CircleCI build fails while building RocksDB by exceeding the 4GB memory limit on 1 container #3

Closed tatsuya6502 closed 7 years ago

tatsuya6502 commented 7 years ago

RocksDB does not build on a CircleCI container because the build runs too many C++ compiler processes (cc1plus) at the same time and exceeds the 4GB memory limit.

https://circleci.com/gh/hibari/hibari-brick-rs/22

Your build has exceeded the memory limit of 4G on 1 container. The results of this build are likely invalid. We have taken a snapshot of the memory usage at the time, which you can find in a build artifact named memory-usage.txt. The RSS column in this file shows the amount of memory used by each process, measured in kilobytes.

In memory-usage.txt, I can see that 30 cc1plus processes were running, and some of them were using about 200MB of RAM each.

The librocksdb-sys crate is the one that builds RocksDB. I need to find a way to control the number of parallel compile jobs in librocksdb-sys.
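
To make the numbers above concrete, here is a rough back-of-the-envelope sketch. It assumes every cc1plus process peaks at about 200MB, which is pessimistic since only some of them did, and the function name `max_parallel_jobs` is made up for illustration.

```rust
/// Largest number of compiler processes that fit in a memory budget,
/// assuming every process peaks at roughly the same resident size.
fn max_parallel_jobs(limit_mb: u64, per_process_mb: u64) -> u64 {
    if per_process_mb == 0 {
        return 0; // a zero estimate is meaningless; avoid dividing by zero
    }
    limit_mb / per_process_mb
}

fn main() {
    let limit_mb = 4 * 1024; // container limit quoted in the CircleCI message
    let per_process_mb = 200; // approximate RSS of the larger cc1plus processes
    let observed = 30; // cc1plus processes counted in memory-usage.txt

    // Worst case if all observed processes reached the assumed peak.
    println!("worst case: {} MB", observed * per_process_mb);
    // Upper bound on concurrent compilers, ignoring rustc and other processes.
    println!("jobs that fit: {}", max_parallel_jobs(limit_mb, per_process_mb));
}
```

Under these assumptions 30 processes could need about 6000MB, well over the limit, so the parallelism has to come down considerably. The bound ignores memory used by rustc and everything else in the container, so a safe value would be lower still.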

tatsuya6502 commented 7 years ago

I found that the librocksdb-sys crate uses the gcc-rs crate with its optional parallel build feature enabled.

librocksdb-sys/Cargo.toml

gcc = { version = "0.3", features = ["parallel"] }

From the gcc-rs documentation: https://github.com/alexcrichton/gcc-rs#optional-features

By default gcc-rs will limit parallelism to $NUM_JOBS, or if not present it will limit it to the number of cpus on the machine.
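
The rule quoted above can be sketched roughly as follows. This is not the gcc-rs source, only an illustration of "use $NUM_JOBS if present, otherwise the CPU count". The function name `resolve_jobs` is hypothetical, and treating an unparsable or zero value as "not present" is my own assumption.

```rust
use std::env;
use std::thread;

/// Pick a job count: a valid, positive NUM_JOBS wins,
/// otherwise fall back to the detected CPU count (never less than 1).
fn resolve_jobs(num_jobs: Option<&str>, cpu_count: usize) -> usize {
    num_jobs
        .and_then(|v| v.trim().parse::<usize>().ok()) // ignore unparsable values
        .filter(|&n| n > 0) // assumption: zero is treated as unset
        .unwrap_or(cpu_count)
        .max(1)
}

fn main() {
    // CPU count as seen by this process; fall back to 1 if it cannot be determined.
    let cpus = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let from_env = env::var("NUM_JOBS").ok();
    println!("jobs = {}", resolve_jobs(from_env.as_deref(), cpus));
}
```

With this rule the outcome depends entirely on which of the two inputs the build script actually sees, so both are worth checking on the CI machine.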

tatsuya6502 commented 7 years ago

I am rerunning CI with SSH debug enabled. I believe the build machine has 32 processor cores.

ubuntu@box3165:~$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 62
model name  : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz

...

processor   : 31
vendor_id   : GenuineIntel
cpu family  : 6
model       : 62
model name  : Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
...

ubuntu@box3165:~$ cat /proc/cpuinfo | grep -c processor
32
ubuntu@box3165:~$ nproc 
2
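
The two numbers above disagree because they answer different questions: /proc/cpuinfo lists the processors it is shown, while nproc reports how many the current process may actually use. The same comparison can be made from Rust, as a small sketch. The helper name `count_cpuinfo_processors` is made up here, and it counts lines that start with "processor", which is slightly stricter than `grep -c processor`.

```rust
use std::fs;
use std::thread;

/// Count the "processor : N" entries in the text of /proc/cpuinfo.
fn count_cpuinfo_processors(cpuinfo: &str) -> usize {
    cpuinfo
        .lines()
        .filter(|line| line.starts_with("processor"))
        .count()
}

fn main() {
    // Processors listed in /proc/cpuinfo (0 if the file cannot be read).
    let listed = fs::read_to_string("/proc/cpuinfo")
        .map(|text| count_cpuinfo_processors(&text))
        .unwrap_or(0);
    // Parallelism available to this process; on Linux this takes the
    // process's CPU affinity into account, much like nproc does.
    let usable = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);

    println!("processors listed in /proc/cpuinfo: {}", listed);
    println!("parallelism available to this process: {}", usable);
}
```

If a build tool derives its job count from the first number rather than the second, a container limited to 2 CPUs can still end up with around 32 parallel compilers, which would be consistent with the roughly 30 cc1plus processes seen earlier.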

tatsuya6502 commented 7 years ago

Adding the environment variable NUM_JOBS: 4 to circle.yaml seems to have no effect. I will debug the gcc-rs crate or the underlying Rayon crate.

Output of top in the build container for https://circleci.com/gh/hibari/hibari-brick-rs/25

top - 00:21:57 up 1 day,  7:56,  2 users,  load average: 19.96, 15.94, 12.55
Tasks: 114 total,  33 running,  81 sleeping,   0 stopped,   0 zombie
%Cpu(s): 19.8 us,  2.9 sy,  0.0 ni, 76.9 id,  0.2 wa,  0.0 hi,  0.1 si,  0.1 st
KiB Mem:  61837020 total, 51297408 used, 10539612 free,   263076 buffers
KiB Swap:        0 total,        0 used,        0 free. 25941292 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                         
 23843 ubuntu    20   0  141732 106384   5040 R   8.6  0.2   0:01.09 cc1plus                                         
 23859 ubuntu    20   0  129356  91132   5040 R   7.6  0.1   0:00.98 cc1plus                                         
 23882 ubuntu    20   0  124248  84744   5016 R   7.6  0.1   0:00.76 cc1plus                                         
 23781 ubuntu    20   0  316660 127372  42032 S   7.3  0.2   0:02.29 rustc                                           
 23911 ubuntu    20   0  152324 119472   5080 R   7.3  0.2   0:01.45 cc1plus                                         
 23851 ubuntu    20   0  150532 110852   5048 R   7.0  0.2   0:01.19 cc1plus                                         
 23915 ubuntu    20   0  151920 112988   5048 R   6.3  0.2   0:01.24 cc1plus                                         
 23981 ubuntu    20   0  195172  42480  16104 S   6.0  0.1   0:00.18 rustc                                           
 23855 ubuntu    20   0  122188  83696   5016 R   5.3  0.1   0:00.78 cc1plus                                         
 23873 ubuntu    20   0   99132  66040   5004 R   5.0  0.1   0:00.74 cc1plus                                         
 23888 ubuntu    20   0  164656 124532   5048 R   5.0  0.2   0:01.32 cc1plus                                         
 23841 ubuntu    20   0  103572  68552   5396 R   3.7  0.1   0:00.76 cc1plus                                         
 23850 ubuntu    20   0  114568  78664   5016 R   3.7  0.1   0:00.79 cc1plus                                         
 23902 ubuntu    20   0  115744  75636   5016 R   3.7  0.1   0:00.69 cc1plus                                         
 23777 ubuntu    20   0  389124 158412  31104 S   3.3  0.3   0:02.19 rustc                                           
 23844 ubuntu    20   0   99220  64028   5012 R   3.3  0.1   0:00.64 cc1plus                                         
 23868 ubuntu    20   0   95020  61876   5036 R   3.3  0.1   0:00.67 cc1plus                                         
 23874 ubuntu    20   0  130676  93288   5016 R   3.3  0.2   0:00.86 cc1plus                                         
 23884 ubuntu    20   0  107624  71772   5016 R   3.3  0.1   0:00.72 cc1plus                                         
 23885 ubuntu    20   0  122096  81512   5016 R   3.3  0.1   0:00.74 cc1plus                                         
 23887 ubuntu    20   0  117800  80528   5016 R   3.3  0.1   0:00.76 cc1plus                                         
 23907 ubuntu    20   0   92836  59756   5012 R   3.3  0.1   0:00.62 cc1plus                                         
 23910 ubuntu    20   0  123028  90072   5040 R   3.3  0.1   0:01.07 cc1plus                                         
 23913 ubuntu    20   0  103304  69648   5036 R   3.3  0.1   0:00.77 cc1plus                                         
 23933 ubuntu    20   0   61008  25412   4532 R   3.3  0.0   0:00.11 cc1plus                                         
 23833 ubuntu    20   0   93688  58940   5008 R   3.0  0.1   0:00.58 cc1plus                                         
 23854 ubuntu    20   0   90736  56840   5036 R   3.0  0.1   0:00.57 cc1plus                                         
 23867 ubuntu    20   0  113680  75492   5012 R   3.0  0.1   0:00.70 cc1plus                                         
 23886 ubuntu    20   0   96976  61604   5016 R   3.0  0.1   0:00.64 cc1plus                                         
 23892 ubuntu    20   0   83536  52012   6936 R   3.0  0.1   0:00.60 cc1plus                                         
 23904 ubuntu    20   0  141760 101444   5020 R   3.0  0.2   0:00.95 cc1plus                                         
 23926 ubuntu    20   0   88060  48288   4948 R   3.0  0.1   0:00.31 cc1plus                                         
 23935 ubuntu    20   0   58992  23216   3628 R   3.0  0.0   0:00.10 cc1plus                                         
 23831 ubuntu    20   0  105168  66708   5000 R   2.7  0.1   0:00.58 cc1plus                                         
 23865 ubuntu    20   0  118376  84512   5012 R   2.7  0.1   0:00.87 cc1plus                                         
  1486 mongodb   20   0  179500  54148  23244 S   0.3  0.1   0:00.67 mongod                                          
  9654 ubuntu    20   0   23648   1664   1172 R   0.3  0.0   0:00.26 top                                             
     1 root      20   0   33492   2836   1444 S   0.0  0.0   0:00.38 init                                            
    41 root      20   0  273264  20036   8068 S   0.0  0.0   0:00.13 Xvfb                                            
    59 root      20   0   27792   1648   1012 S   0.0  0.0   0:00.04 mountall                                        
   337 root      20   0   15256    628    428 S   0.0  0.0   0:00.04 upstart-socket-  

tatsuya6502 commented 7 years ago

After some debugging in gcc-rs, I found that $NUM_JOBS was being overridden by Cargo. Using the --jobs N option of cargo build seems to solve the problem.

e.g. https://circleci.com/gh/hibari/hibari-brick-rs/26

Command: rustup run stable cargo test --jobs 4 --release

ubuntu@box2362:~$ ps auxwww | egrep -c '[c]c1plus'
4

ubuntu@box2362:~$ top
Tasks:  55 total,   5 running,  50 sleeping,   0 stopped,   0 zombie
%Cpu(s): 33.3 us,  4.8 sy,  0.0 ni, 61.3 id,  0.3 wa,  0.0 hi,  0.2 si,  0.1 st
KiB Mem:  25190286+total, 88193712 used, 16370915+free,   336236 buffers
KiB Swap:        0 total,        0 used,        0 free. 53949096 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND  
 27028 ubuntu    20   0  324656 288836   9320 R  51.0  0.1   0:05.18 cc1plus
 27040 ubuntu    20   0  295888 257860   4868 R  48.9  0.1   0:03.54 cc1plus
 27036 ubuntu    20   0  326644 291584   9184 R  48.6  0.1   0:04.49 cc1plus
 27048 ubuntu    20   0   42316   8436   3580 R   0.9  0.0   0:00.03 cc1plus
 26508 ubuntu    20   0  103568   1712    756 S   0.3  0.0   0:00.01 sshd
     1 root      20   0   33668   2932   1444 S   0.0  0.0   0:00.50 init
    45 root      20   0  273600  20068   8092 S   0.0  0.0   0:00.15 Xvfb
   338 root      20   0   15256    624    432 S   0.0  0.0   0:00.05 upstart-socket-
...
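
For anyone who wants to confirm the override themselves: Cargo sets NUM_JOBS in the environment of every build script based on its own --jobs setting, so a value exported in the CI configuration never reaches the build script. A throwaway build.rs along these lines would show what the script actually receives. This is a hypothetical diagnostic, not part of librocksdb-sys or gcc-rs, and the helper name `jobs_report` is made up.

```rust
// build.rs (hypothetical diagnostic only)
use std::env;

/// Format a `cargo:warning=` line so that Cargo prints the value
/// of NUM_JOBS as seen by the build script during the build.
fn jobs_report(num_jobs: Option<&str>) -> String {
    match num_jobs {
        Some(v) => format!("cargo:warning=NUM_JOBS seen by build script: {}", v),
        None => "cargo:warning=NUM_JOBS is not set".to_string(),
    }
}

fn main() {
    // Read the variable exactly as a crate like gcc-rs would see it.
    let value = env::var("NUM_JOBS").ok();
    println!("{}", jobs_report(value.as_deref()));
}
```

Running the build once with and once without `--jobs 4` should show the reported value change, regardless of what the shell environment says.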

tatsuya6502 commented 7 years ago

CircleCI job 26 (with cargo test --jobs 4) has passed. I also opened a pull request to gcc-rs with a minor documentation enhancement.

Closing this issue.