Open: szilard opened this issue 5 years ago
x1e.8xlarge (32 cores, 1 NUMA, 960GB RAM)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E7-8880 v3 @ 2.30GHz
Stepping: 4
CPU MHz: 2699.984
CPU max MHz: 3100.0000
CPU min MHz: 1200.0000
BogoMIPS: 4600.10
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq monitor est ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt ida
~/spark-2.4.2-bin-hadoop2.7/bin/spark-shell --master local[*] --driver-memory 940G --executor-memory 940G
scala> val model = pipeline.fit(d_train)
[Stage 443:> (0 + 32) / 32]
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007eb838e80000, 51384942592, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 51384942592 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/ubuntu/GBM-perf/wip-testing/spark/hs_err_pid2301.log
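To put the failed request in context, here is a minimal arithmetic sketch in plain Scala (it can be pasted into spark-shell). It assumes the `G` in `--driver-memory 940G` means GiB and takes the 960 figure from the instance description at face value. All value names are hypothetical.

```scala
// Size of the mmap request that failed, copied from the JVM error above.
val failedBytes = 51384942592L
val bytesPerGiB = 1024L * 1024L * 1024L
val failedGiB = failedBytes.toDouble / bytesPerGiB

// Assumptions: 960 is the instance RAM as quoted in this issue,
// 940 is the heap limit passed via --driver-memory.
val totalGiB = 960.0
val heapLimitGiB = 940.0
val outsideHeapGiB = totalGiB - heapLimitGiB

println(f"failed commit request: $failedGiB%.1f GiB")
println(f"RAM not covered by the heap limit: $outsideHeapGiB%.1f GiB")
```

The JVM asked the OS to back roughly 48 GiB more of its reserved heap in one step, while the heap limit leaves only about 20 GiB of the machine for everything else (OS, JVM non-heap memory). So the OS refusing the request before the heap limit itself is reached is not surprising.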
Let's try to learn only 1 tree of depth 1:
runs 1150 sec, AUC=0.634, RAM usage 620GB
1 tree depth 10:
runs 1350 sec, AUC=0.712, RAM usage 620GB
10 trees depth 10:
runs 7850 sec, AUC=0.731, RAM usage 780GB
| trees | depth | 100M: time [s] | 100M: AUC | 100M: RAM [GB] | 10M: time [s] | 10M: AUC | 10M: RAM [GB] |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1150 | 0.634 | 620 | 70 | 0.635 | 110 |
| 1 | 10 | 1350 | 0.712 | 620 | 90 | 0.712 | 112 |
| 10 | 10 | 7850 | 0.731 | 780 | 830 | 0.731 | 125 |
| 100 | 10 | crash (OOM) | | >960 (OOM) | 8070 | 0.755 | 230 |
100M ran on: x1e.8xlarge (32 cores, 1 NUMA, 960GB RAM)
10M ran on: r4.8xlarge (32 cores, 1 NUMA, 240GB RAM)
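The scaling from 10M to 100M rows is easier to read as ratios. The following is a rough sketch in plain Scala, using only the numbers from the table. The `Run` case class and value names are hypothetical. Since the two sizes ran on different instance types, the time ratios also include any hardware difference.

```scala
// Measurements copied from the table: trees, depth, time [s], RAM [GB].
case class Run(trees: Int, depth: Int, timeSec: Double, ramGB: Double)

// Only configurations that finished on both dataset sizes are included.
val runs100M = Seq(Run(1, 1, 1150, 620), Run(1, 10, 1350, 620), Run(10, 10, 7850, 780))
val runs10M  = Seq(Run(1, 1, 70, 110), Run(1, 10, 90, 112), Run(10, 10, 830, 125))

// Ratio of the 100M run to the 10M run for each configuration.
val ratios = runs100M.zip(runs10M).map { case (big, small) =>
  (big.trees, big.depth, big.timeSec / small.timeSec, big.ramGB / small.ramGB)
}

ratios.foreach { case (trees, depth, timeRatio, ramRatio) =>
  println(f"trees=$trees%d depth=$depth%d time ratio=$timeRatio%.1f RAM ratio=$ramRatio%.1f")
}
```

For 10x the rows, training time grows by roughly 9x to 16x, while RAM usage grows by roughly 5.5x to 6x.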