heidsoft / cloud-bigdata-book


JVM debugging #45

Open heidsoft opened 6 years ago

heidsoft commented 6 years ago

DefNew vs ParNew

The development history of the HotSpot VM

Jon Masamitsu, one of the veterans on the HotSpot VM GC team, wrote a blog post explaining this long ago: https://blogs.oracle.com/jonthecollector/entry/our_collectors

In short, the sheer number of collectors reflects the development history and implementation details of the HotSpot VM. I'm writing a piece that covers this history; once it's finished I'll post a link here as well.

DefNewGeneration is the "default new generation"
ParNewGeneration is the "parallel new generation"

Originally the HotSpot VM had no parallel GC; at that time there was only NewGeneration. When a parallel GC for the young gen was about to be added, the original NewGeneration was renamed DefNewGeneration, and the newly added parallel version was named ParNewGeneration.

All of these XXXGeneration classes live inside HotSpot VM's "generational GC framework". HotSpot VM originally encouraged developers to build GCs within that framework, but one developer refused to be constrained by it and wrote a new parallel GC from scratch without using the existing framework. He then got the performance testing team to benchmark with it, the results were quite good, and so the GC was accepted into the HotSpot VM. That is the ParallelScavenge we see today.

(As a result, the HotSpot GC team has had to maintain two parallel GCs that are functionally almost identical but differ in all sorts of implementation details. It is, frankly, quite a headache.)

Scavenge, or scavenging GC, is simply another name for copying GC. In the HotSpot VM, all the minor GC collectors use scavenging: DefNew, ParNew, and ParallelScavenge alike. The difference is that DefNew is a serial copying GC, while the latter two are parallel copying GCs.

As the name suggests, the original intent of "ParallelScavenge" was to parallelize the "scavenge", in other words to parallelize minor GC. Full GC was not the focus at the time.

The point of parallelizing GC is to speed it up, that is, to increase throughput. In that sense both ParNew and ParallelScavenge could be called throughput GCs,
but in HotSpot VM terminology "Throughput GC" usually refers specifically to ParallelScavenge.

================================

ParallelScavenge and ParNew are both parallel GCs whose main job is to collect the young gen in parallel; their goals and performance are quite similar. The most notable differences are:
1. PS used to traverse the object graph in breadth-first order. In JDK 6 the default was changed to depth-first traversal, with a UseDepthFirstScavengeOrder flag to choose between depth-first and breadth-first. After JDK 6u18 the flag was removed and PS only does depth-first traversal. ParNew has always traversed in breadth-first order only.
2. PS fully implements the adaptive size policy, while ParNew and the other GCs inside the "generational GC framework" never finished implementing it (not that it couldn't be done; it's just tedious and nobody had the resources). So never, ever use UseAdaptiveSizePolicy with the ParNew+CMS combination; use it only with UseParallelGC or UseParallelOldGC (see the flag sketch after this list).
3. Because it lives inside the "generational GC framework", ParNew can be paired with CMS, whereas ParallelScavenge cannot. The main reason ParNew GC was ported from the Exact VM to the HotSpot VM in the first place was to pair it with CMS.
4. After PS became the main throughput GC, it also gained NUMA-aware optimizations; ParNew never received a NUMA-optimized implementation.
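
To make point 2 concrete, a hedged sketch of the two collector combinations on a JDK 8 HotSpot VM (MyApp and the heap size are placeholders, not recommendations):

java -Xmx4g -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+UseAdaptiveSizePolicy MyApp   # adaptive sizing is safe here
java -Xmx4g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC MyApp                              # do NOT add UseAdaptiveSizePolicy here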

================================

One more thing to note: the above says ParallelScavenge collects the young gen in parallel — what about the old/perm gens?

In fact, the original goal of ParallelScavenge was only to collect the young gen in parallel; its full GC implementation was effectively the same as serial GC. But because it did not use HotSpot VM's generational GC framework — it implemented its own CollectedHeap subclass, ParallelScavengeHeap, with its own independent set of interfaces — it was incompatible with HotSpot's other GCs at the time. Most of the genuinely useful code is in PSScavenge ("ParallelScavenge's Scavenge"), the collector responsible for minor GC. The collector responsible for full GC is called PSMarkSweep ("ParallelScavenge's MarkSweep"), which is really just a thin wrapper around the serial GC's core: underneath it is the same LISP2-style mark-compact collector (don't be fooled by the name — it is not a mark-sweep collector).

When -XX:+UseParallelGC is enabled, the combination used is PSScavenge + PSMarkSweep.
This is "ParallelScavenge" in the literal sense — only the "scavenge" is parallelized.

So if you really want a one-to-one correspondence, PSScavenge is the true counterpart of ParNew; the name ParallelScavenge refers both to the whole new GC and to its real selling point, PSScavenge.

For reasons I don't know, the later parallelization of full GC was never done on top of the original generational GC framework, only on the ParallelScavenge side. The result is a parallel full GC collector based on the LISP2 algorithm, named PSCompact ("ParallelScavenge-MarkCompact"), which collects the entire GC heap.

When -XX:+UseParallelOldGC is enabled, the combination used is PSScavenge + PSCompact.
At this point "ParallelScavenge" no longer lives up to its name — it parallelizes not only the "scavenge" (minor GC) but also the "mark-compact" (full GC).
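
A hedged sketch of the two combinations described above (MyApp is a placeholder; note that, if I recall correctly, from JDK 7u4 onward -XX:+UseParallelGC enables -XX:+UseParallelOldGC by default, so the explicit minus is needed to reproduce the old PSMarkSweep behavior):

java -XX:+UseParallelGC -XX:-UseParallelOldGC MyApp   # PSScavenge minor GC + serial-style PSMarkSweep full GC
java -XX:+UseParallelOldGC MyApp                      # PSScavenge minor GC + parallel PSCompact full GC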

JVM out-of-memory crash log

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (gcTaskThread.cpp:48), pid=8624, tid=0x00007f5eca301700
#
# JRE version:  (8.0_101-b13) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.101-b13 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

---------------  T H R E A D  ---------------

Current thread (0x00000000013b5000):  JavaThread "Unknown thread" [_thread_in_vm, id=8624, stack(0x00007ffe60ba8000,0x00007ffe60ca8000)]

Stack: [0x00007ffe60ba8000,0x00007ffe60ca8000],  sp=0x00007ffe60ca2860,  free space=1002k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xac3f0a]  VMError::report_and_die()+0x2ba
V  [libjvm.so+0x4fbf9b]  report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V  [libjvm.so+0x5d674f]  GCTaskThread::GCTaskThread(GCTaskManager*, unsigned int, unsigned int)+0x15f
V  [libjvm.so+0x5d550b]  GCTaskManager::initialize()+0x3ab
V  [libjvm.so+0x9457fd]  ParallelScavengeHeap::initialize()+0x34d
V  [libjvm.so+0xa8c753]  Universe::initialize_heap()+0xf3
V  [libjvm.so+0xa8c99e]  universe_init()+0x3e
V  [libjvm.so+0x63bdf5]  init_globals()+0x65
V  [libjvm.so+0xa70bfe]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x23e
V  [libjvm.so+0x6d08f4]  JNI_CreateJavaVM+0x74
C  [libjli.so+0x745e]  JavaMain+0x9e
C  [libjli.so+0xb223]  ContinueInNewThread0+0x63
C  [libjli.so+0x697a]  ContinueInNewThread+0x7a
C  [libjli.so+0x99f8]  JLI_Launch+0x798
C  [jmap+0x6a3]
C  [libc.so.6+0x1ed1d]  __libc_start_main+0xfd

---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )

Other Threads:

=>0x00000000013b5000 (exited) JavaThread "Unknown thread" [_thread_in_vm, id=8624, stack(0x00007ffe60ba8000,0x00007ffe60ca8000)]

VM state:not at safepoint (not fully initialized)

VM Mutex/Monitor currently owned by a thread: None

GC Heap History (0 events):
No events

Deoptimization events (0 events):
No events

Internal exceptions (0 events):
No events

Events (0 events):
No events

Dynamic libraries:
00400000-00401000 r-xp 00000000 ca:01 1188197                            /usr/java/jdk1.8.0_101/bin/jmap
00600000-00601000 rw-p 00000000 ca:01 1188197                            /usr/java/jdk1.8.0_101/bin/jmap
013ab000-013ef000 rw-p 00000000 00:00 0                                  [heap]
6c4c00000-6c5180000 rw-p 00000000 00:00 0 
6c5180000-76c400000 ---p 00000000 00:00 0 
76c400000-76c680000 rw-p 00000000 00:00 0 
76c680000-7c0000000 ---p 00000000 00:00 0 
331e000000-331e020000 r-xp 00000000 ca:01 262168                         /lib64/ld-2.12.so
331e220000-331e221000 r--p 00020000 ca:01 262168                         /lib64/ld-2.12.so
331e221000-331e222000 rw-p 00021000 ca:01 262168                         /lib64/ld-2.12.so
331e222000-331e223000 rw-p 00000000 00:00 0 
331e400000-331e58a000 r-xp 00000000 ca:01 262172                         /lib64/libc-2.12.so
331e58a000-331e78a000 ---p 0018a000 ca:01 262172                         /lib64/libc-2.12.so
331e78a000-331e78e000 r--p 0018a000 ca:01 262172                         /lib64/libc-2.12.so
331e78e000-331e790000 rw-p 0018e000 ca:01 262172                         /lib64/libc-2.12.so
331e790000-331e794000 rw-p 00000000 00:00 0 
331e800000-331e817000 r-xp 00000000 ca:01 262174                         /lib64/libpthread-2.12.so
331e817000-331ea17000 ---p 00017000 ca:01 262174                         /lib64/libpthread-2.12.so
331ea17000-331ea18000 r--p 00017000 ca:01 262174                         /lib64/libpthread-2.12.so
331ea18000-331ea19000 rw-p 00018000 ca:01 262174                         /lib64/libpthread-2.12.so
331ea19000-331ea1d000 rw-p 00000000 00:00 0 
331ec00000-331ec02000 r-xp 00000000 ca:01 262266                         /lib64/libdl-2.12.so
331ec02000-331ee02000 ---p 00002000 ca:01 262266                         /lib64/libdl-2.12.so
331ee02000-331ee03000 r--p 00002000 ca:01 262266                         /lib64/libdl-2.12.so
331ee03000-331ee04000 rw-p 00003000 ca:01 262266                         /lib64/libdl-2.12.so
331f000000-331f007000 r-xp 00000000 ca:01 262225                         /lib64/librt-2.12.so
331f007000-331f206000 ---p 00007000 ca:01 262225                         /lib64/librt-2.12.so
331f206000-331f207000 r--p 00006000 ca:01 262225                         /lib64/librt-2.12.so
331f207000-331f208000 rw-p 00007000 ca:01 262225                         /lib64/librt-2.12.so
331f400000-331f483000 r-xp 00000000 ca:01 262194                         /lib64/libm-2.12.so
331f483000-331f682000 ---p 00083000 ca:01 262194                         /lib64/libm-2.12.so
331f682000-331f683000 r--p 00082000 ca:01 262194                         /lib64/libm-2.12.so
331f683000-331f684000 rw-p 00083000 ca:01 262194                         /lib64/libm-2.12.so
7f5eb8647000-7f5eb88d1000 rw-p 00000000 00:00 0 
7f5eb88d1000-7f5eb8e0a000 ---p 00000000 00:00 0 
7f5eb8e0a000-7f5eb8e0d000 rw-p 00000000 00:00 0 
7f5eb8e0d000-7f5eb9346000 ---p 00000000 00:00 0 
7f5eb9346000-7f5eb9348000 rw-p 00000000 00:00 0 
7f5eb9348000-7f5eb95e4000 ---p 00000000 00:00 0 
7f5eb95e4000-7f5eb95ef000 rw-p 00000000 00:00 0 
7f5eb95ef000-7f5eb99a5000 ---p 00000000 00:00 0 
7f5eb99a5000-7f5eb9c15000 rwxp 00000000 00:00 0 
7f5eb9c15000-7f5ec89a5000 ---p 00000000 00:00 0 
7f5ec89a5000-7f5ec89bf000 r-xp 00000000 ca:01 1188363                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libzip.so
7f5ec89bf000-7f5ec8bbf000 ---p 0001a000 ca:01 1188363                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libzip.so
7f5ec8bbf000-7f5ec8bc0000 rw-p 0001a000 ca:01 1188363                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libzip.so
7f5ec8bc0000-7f5ec8bcd000 r-xp 00000000 ca:01 262212                     /lib64/libnss_files-2.12.so
7f5ec8bcd000-7f5ec8dcc000 ---p 0000d000 ca:01 262212                     /lib64/libnss_files-2.12.so
7f5ec8dcc000-7f5ec8dcd000 r--p 0000c000 ca:01 262212                     /lib64/libnss_files-2.12.so
7f5ec8dcd000-7f5ec8dce000 rw-p 0000d000 ca:01 262212                     /lib64/libnss_files-2.12.so
7f5ec8dce000-7f5ec8dd6000 rw-s 00000000 ca:01 394038                     /tmp/hsperfdata_app/8624
7f5ec8dd6000-7f5ec8e00000 r-xp 00000000 ca:01 1188327                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libjava.so
7f5ec8e00000-7f5ec9000000 ---p 0002a000 ca:01 1188327                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libjava.so
7f5ec9000000-7f5ec9002000 rw-p 0002a000 ca:01 1188327                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libjava.so
7f5ec9002000-7f5ec900f000 r-xp 00000000 ca:01 1188362                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libverify.so
7f5ec900f000-7f5ec920f000 ---p 0000d000 ca:01 1188362                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libverify.so
7f5ec920f000-7f5ec9211000 rw-p 0000d000 ca:01 1188362                    /usr/java/jdk1.8.0_101/jre/lib/amd64/libverify.so
7f5ec9211000-7f5ec9212000 ---p 00000000 00:00 0 
7f5ec9212000-7f5ec9312000 rw-p 00000000 00:00 0 
7f5ec9312000-7f5ec9fdd000 r-xp 00000000 ca:01 135664                     /usr/java/jdk1.8.0_101/jre/lib/amd64/server/libjvm.so
7f5ec9fdd000-7f5eca1dc000 ---p 00ccb000 ca:01 135664                     /usr/java/jdk1.8.0_101/jre/lib/amd64/server/libjvm.so
7f5eca1dc000-7f5eca2b5000 rw-p 00cca000 ca:01 135664                     /usr/java/jdk1.8.0_101/jre/lib/amd64/server/libjvm.so
7f5eca2b5000-7f5eca303000 rw-p 00000000 00:00 0 
7f5eca303000-7f5eca318000 r-xp 00000000 ca:01 270940                     /usr/java/jdk1.8.0_101/lib/amd64/jli/libjli.so
7f5eca318000-7f5eca518000 ---p 00015000 ca:01 270940                     /usr/java/jdk1.8.0_101/lib/amd64/jli/libjli.so
7f5eca518000-7f5eca519000 rw-p 00015000 ca:01 270940                     /usr/java/jdk1.8.0_101/lib/amd64/jli/libjli.so
7f5eca519000-7f5eca51a000 rw-p 00000000 00:00 0 
7f5eca51e000-7f5eca521000 rw-p 00000000 00:00 0 
7f5eca521000-7f5eca522000 r--p 00000000 00:00 0 
7f5eca522000-7f5eca523000 rw-p 00000000 00:00 0 
7ffe60ba8000-7ffe60bab000 ---p 00000000 00:00 0 
7ffe60bab000-7ffe60ca8000 rw-p 00000000 00:00 0                          [stack]
7ffe60dcf000-7ffe60dd0000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

VM Arguments:
jvm_args: -Dapplication.home=/usr/java/jdk1.8.0_101 -Xms8m -Dsun.jvm.hotspot.debugger.useProcDebugger -Dsun.jvm.hotspot.debugger.useWindbgDebugger 
java_command: sun.tools.jmap.JMap -histo 13665
java_class_path (initial): /usr/java/jdk1.8.0_101/lib/tools.jar:/usr/java/jdk1.8.0_101/lib/sa-jdi.jar:/usr/java/jdk1.8.0_101/classes
Launcher Type: SUN_STANDARD

Environment Variables:
PATH=/sbin:/bin:/usr/sbin:/usr/bin
USERNAME=app
SHELL=/bin/bash

Signal Handlers:
SIGSEGV: [libjvm.so+0xac4790], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGBUS: [libjvm.so+0xac4790], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGFPE: [libjvm.so+0x91f140], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGPIPE: [libjvm.so+0x91f140], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGXFSZ: [libjvm.so+0x91f140], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGILL: [libjvm.so+0x91f140], sa_mask[0]=11111111011111111101111111111110, sa_flags=SA_RESTART|SA_SIGINFO
SIGUSR1: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGUSR2: [libjvm.so+0x920770], sa_mask[0]=00000000000000000000000000000000, sa_flags=SA_RESTART|SA_SIGINFO
SIGHUP: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGINT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGTERM: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none
SIGQUIT: SIG_DFL, sa_mask[0]=00000000000000000000000000000000, sa_flags=none

---------------  S Y S T E M  ---------------

OS:CentOS release 6.9 (Final)

uname:Linux 2.6.32-696.10.1.el6.x86_64 #1 SMP Tue Aug 22 18:51:35 UTC 2017 x86_64
libc:glibc 2.12 NPTL 2.12 
rlimit: STACK 10240k, CORE 0k, NPROC 1024, NOFILE 262140, AS infinity
load average:1.85 1.54 1.67

/proc/meminfo:
MemTotal:       16463744 kB
MemFree:         4934204 kB
Buffers:             456 kB
Cached:            43060 kB
SwapCached:            0 kB
Active:         11227448 kB
Inactive:          14512 kB
Active(anon):   11198552 kB
Inactive(anon):      196 kB
Active(file):      28896 kB
Inactive(file):    14316 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:               184 kB
Writeback:             0 kB
AnonPages:      11199020 kB
Mapped:            35716 kB
Shmem:               196 kB
Slab:             102928 kB
SReclaimable:      68548 kB
SUnreclaim:        34380 kB
KernelStack:       37392 kB
PageTables:        31632 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8231872 kB
Committed_AS:   13719136 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       40632 kB
VmallocChunk:   34359690300 kB
HardwareCorrupted:     0 kB
AnonHugePages:  10573824 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        6144 kB
DirectMap2M:    16771072 kB

CPU:total 4 (32 cores per cpu, 2 threads per core) family 6 model 62 stepping 4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, aes, ht, tsc

/proc/cpuinfo:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
microcode       : 1066
cpu MHz         : 2600.042
cache size      : 20480 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 5200.08
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
microcode       : 1066
cpu MHz         : 2600.042
cache size      : 20480 KB
physical id     : 2
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 556.62
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
microcode       : 1066
cpu MHz         : 2600.042
cache size      : 20480 KB
physical id     : 4
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 4
initial apicid  : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 557.00
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping        : 4
microcode       : 1066
cpu MHz         : 2600.042
cache size      : 20480 KB
physical id     : 6
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 6
initial apicid  : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm
bogomips        : 572.39
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Memory: 4k page, physical 16463744k(4934204k free), swap 0k(0k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (25.101-b13) for linux-amd64 JRE (1.8.0_101-b13), built on Jun 22 2016 02:59:44 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)

time: Sat Jun 23 23:41:41 2018
elapsed time: 0 seconds (0d 0h 0m 0s)
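
The crash above happened while running jmap (java_command: sun.tools.jmap.JMap -histo 13665): the JVM aborted inside ParallelScavengeHeap::initialize because it could not create GC worker threads ("Cannot create GC thread. Out of system resources."). Note rlimit NPROC 1024 and swap 0k in the log. A minimal sketch of the checks I would run on such a host, using only standard Linux commands:

ulimit -u                          # per-user process/thread limit (the NPROC rlimit above)
cat /proc/sys/kernel/threads-max   # system-wide thread limit
ps -eLf | wc -l                    # rough count of existing threads
free -m                            # physical memory and swap (swap is 0 in this log)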
heidsoft commented 6 years ago

https://www.cnblogs.com/dragonsuc/p/6927824.html

heidsoft commented 6 years ago

http://hpjianhua.iteye.com/blog/1599991

heidsoft commented 6 years ago

https://www.jianshu.com/p/fff649585b78

heidsoft commented 6 years ago

http://xstarcd.github.io/wiki/Java/jmap_jstack.html

heidsoft commented 6 years ago

http://arganzheng.life/understanding-the-jvm.html

heidsoft commented 6 years ago

http://aandds.com/blog/java-jvm.html http://tacy.github.io/blog/2014/03/24/hotspot-gc/ http://ifeve.com/useful-jvm-flags-part-8-gc-logging/ https://blog.csdn.net/iycynna_123/article/details/64444211 https://segmentfault.com/a/1190000013509330 http://alaric.iteye.com/blog/2263682 https://my.oschina.net/7001/blog/870988 https://juejin.im/post/5a9b811a6fb9a028e46e1c88 https://www.jianshu.com/p/0522dc5aeba6 https://blog.csdn.net/sinat_25306771/article/details/52374498 https://www.jianshu.com/p/a65af5c2514b

heidsoft commented 6 years ago

https://www.sczyh30.com/page/8/

heidsoft commented 6 years ago

Analysis and resolution of frequent full GCs in an application

Analysis
   When full GC was happening frequently, jstack printed stack traces as follows:
   sudo -u admin -H /opt/taobao/java/bin/jstack `pgrep java` > #your file path#

   From the traces it was clear the app was indeed busy processing the low-price info job
   In addition, the following two commands were run both while the application was doing frequent full GCs and while it was healthy:
   sudo -u admin -H /opt/taobao/java/bin/jmap -histo `pgrep` > #your file path#
   sudo -u admin -H /opt/taobao/java/bin/jmap -histo:live `pgrep` > #your file path#  (the live option itself triggers a full GC)
   The goal was to confirm two things:
   (1) whether some abnormal references were keeping objects reachable and thus unreclaimable (a Java memory leak)
   (2) whether the real problem was that, during frequent full GCs, a flood of incoming requests kept allocating memory
        faster than GC could keep up, causing concurrent mode failure
   The figure below shows the histo output from jmap without :live while the application was healthy:

   The figure below shows the histo output from jmap with :live while the application was healthy:

   The figure below shows the histo output from jmap, both without and with :live, while the application was doing frequent full GCs:

   From the figures above we can see that:
   (1) when the application is healthy, the objects highlighted in red do get collected, so this is not a memory leak
   (2) during frequent full GCs, the highlighted objects are not collected even with :live, so the problem really is that
        during frequent full GCs a large number of requests keep coming in and allocating memory faster than GC can cope
First, from a problem-solving angle: what causes such frequent full GCs?
Start by analyzing CMS GC
   First, an overview of CMS GC:
   (1) young gc
   As we can see, when eden fills up, young gc is performed by the ParNew collector
   ParNew: 2230361K->129028K(2403008K), 0.2363650 secs — explanation:
   1) 2230361K->129028K: the size of eden + s1 (or s2) before and after the collection
   2) 2403008K: the usable size of the young gen, i.e. eden + s1 (or s2)
   3) 0.2363650 secs: time taken
   2324774K->223451K(3975872K), 0.2366810 sec — explanation:
   1) 2324774K->223451K: the change in the total heap size
(heap = (young + old) + perm; young = eden + s1 + s2; s1 = s2 = young/(survivor ratio + 2))
   2) 0.2366810 sec: time taken
   [Times: user=0.60 sys=0.02, real=0.24 secs] — explanation: user time, system time, wall-clock time
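
As a hedged aside, log lines in this format come from the JDK 8 GC logging flags below (the log path is a placeholder):

java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:/path/to/gc.log MyApp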

   (2) cms gc
   With the CMS collector, an old gen collection proceeds as follows:
   a) First, the JVM uses -XX:CMSInitiatingOccupancyFraction and -XX:+UseCMSInitiatingOccupancyOnly
     to decide when to start garbage collection
   b) If -XX:+UseCMSInitiatingOccupancyOnly is set, a cms gc is triggered only when old gen occupancy actually reaches
      the ratio set by -XX:CMSInitiatingOccupancyFraction
   c) If -XX:+UseCMSInitiatingOccupancyOnly is not set, the VM decides on its own, based on collected statistics, when to
     trigger a cms gc; this is why you sometimes set the threshold to 80% but see cms gc kick in at 50% — it happens because
     this flag was not set
   d) When a cms gc starts, the first phase is CMS-initial-mark. This is the initial marking phase and it stops the world,
      so it only marks objects directly reachable from the root set
   CMS-initial-mark: 961330K(1572864K) — the used and total space of the old gen at marking time
   e) The next phase is CMS-concurrent-mark, which runs concurrently with the application threads — this is what "concurrent
      collector" refers to. Its main job is to mark reachable objects
   This phase prints two log lines: CMS-concurrent-mark-start and CMS-concurrent-mark
   f) The next phase is CMS-concurrent-preclean, which does some pre-cleaning. Because marking runs concurrently with the
     application, some objects change state after being marked, and this phase handles exactly that
   Since the later Rescan phase also stops the world, preclean does part of the work up front to keep that pause as short
     as possible
   This phase prints two log lines: CMS-concurrent-preclean-start and CMS-concurrent-preclean
   g) The next phase is CMS-concurrent-abortable-preclean. It was added to make cms gc more controllable; it also does some
      pre-cleaning, to shorten the application pause caused by the Rescan phase
   Several flags affect this phase:
   -XX:CMSMaxAbortablePrecleanTime: the abortable-preclean phase ends once it has run for this long
   -XX:CMSScheduleRemarkEdenSizeThreshold (default 2m): controls when the abortable-preclean phase starts,
i.e. it only starts once eden usage reaches this value
   -XX:CMSScheduleRemarkEdenPenetration (default 50%): controls when the abortable-preclean phase stops
   This phase prints log lines such as:
   CMS-concurrent-abortable-preclean-start, CMS-concurrent-abortable-preclean,
CMS: abort preclean due to time XXX
   h) The next phase is the second stop-the-world phase, Rescan (remark). Application threads are paused while objects are
      rescanned and marked
   YG occupancy: 964861K(2403008K) — the state of the young gen at that moment
   CMS remark: 961330K(1572864K) — the state of the old gen at that moment
   It also prints the time spent on weak reference processing, class unloading, and so on
   i) The next phase is CMS-concurrent-sweep, which sweeps garbage concurrently
   j) The last phase is CMS-concurrent-reset, which resets the relevant data structures for the next cms gc
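
A hedged sketch pulling together the CMS flags mentioned in steps a), b) and g) (the values are illustrative only):

java -XX:+UseConcMarkSweepGC \
     -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly \
     -XX:CMSMaxAbortablePrecleanTime=5000 MyApp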

   (3) full gc:
   Two situations trigger a full gc, and during a full gc the entire application is paused:
   a) concurrent-mode-failure: while a cms gc is in progress, new objects need to be promoted into the old gen, but the old
      gen does not have enough space
   b) promotion-failed: during a young gc, some young-gen objects are still live but S1 or S2 cannot hold them,
     so they need to go to the old gen, but the old gen cannot accommodate them either

Cause of the frequent full GCs
   The logs show a large number of concurrent-mode-failure events, so the full GCs are indeed caused by new objects being
promoted into the old gen while a cms gc is in progress and the old gen running out of space
   The process's JVM arguments are as follows:

   The two parameters that affect cms gc duration and triggering are:
   -XX:CMSMaxAbortablePrecleanTime=5000
   -XX:CMSInitiatingOccupancyFraction=80
   The fix targets these two parameters
   The root cause is that each request consumes too much memory
Solution
   (1) For the cms gc trigger point, lower -XX:CMSInitiatingOccupancyFraction to 50 so that cms gc starts earlier; this
        mitigates the case where the old gen reaches 80%, cms gc cannot finish in time, and a concurrent mode failure turns
        into a full gc
   (2) Change -XX:CMSMaxAbortablePrecleanTime to 500 to shorten the CMS-concurrent-abortable-preclean phase
   (3) Since cms gc does not compact, add -XX:+UseCMSCompactAtFullCollection
       (compact the heap after cms gc) and -XX:CMSFullGCsBeforeCompaction=4
       (compact after every 4 full GCs)
   But after running for a while — the interval was merely longer this time — the frequent full GCs came back
   The sizes of the heap generations were then computed (you can check them with jmap -heap):
   total heap = young + old = 4096m
   perm: 256m
   young = s1 + s2 + eden = 2560m
   young avail = eden + s1 = 2133.375 + 213.3125 = 2346.6875m
   s1 = 2560/(10+1+1) = 213.3125m
   s2 = s1
   eden = 2133.375m
   old = 1536m
   Note that eden is larger than old. In the extreme case where all objects in the young gen have to be promoted to old, a
full gc is triggered; so when the application is doing frequent full GCs it is very likely the old gen is too small. Hence
the idea of enlarging the old gen and shrinking the young gen
   Changed to:
   -Xmn1920m
   The new generation sizes:
   total heap = young + old = 4096m
   perm: 256m
   young = s1 + s2 + eden = 1920m
   young avail = eden + s1 = 1600 + 160 = 1760m
   s1 = 1920/(10+1+1) = 160m
   s2 = s1
   eden = 1600m
   old = 2176m
   Now eden is smaller than old, which relieves the problem somewhat

   After the change the application ran for 2 days and the problem was resolved — no more frequent full GCs
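
A hedged reconstruction of the JVM arguments implied by the numbers above (the survivor ratio of 10 is inferred from s1 = young/(10+1+1); treat this as an illustration, not the actual production command line):

java -Xms4096m -Xmx4096m -Xmn1920m -XX:SurvivorRatio=10 -XX:PermSize=256m -XX:MaxPermSize=256m \
     -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=500 \
     -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=4 MyApp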

https://my.oschina.net/goldwave/blog/168516

http://itindex.net/detail/56085-full-gc-%E5%88%86%E6%9E%90

heidsoft commented 5 years ago

Docker JVM garbage collection

https://www.reddit.com/r/docker/comments/7u06xo/docker_and_the_jvm_garbage_collector/ https://dzone.com/articles/how-to-decrease-jvm-memory-consumption-in-docker-u https://jaxenter.com/nobody-puts-java-container-139373.html https://www.youtube.com/watch?v=sJ-_htmU0TE&feature=youtu.be&t=20m17s https://developers.redhat.com/blog/2017/03/14/java-inside-docker/

heidsoft commented 5 years ago

In the HotSpot VM implementation there are two broad categories of GC:


Partial GC: does not collect the entire GC heap
young gc: collects only the young gen
old gc: collects only the old gen; only CMS's concurrent collection does this
mixed GC: collects the entire young gen plus part of the old gen; only G1 does this
Full GC: collects the entire heap, including young gen, old gen, perm gen (if present), etc.
You will also see the terms Minor GC and Major GC in various articles and books. Minor GC corresponds to young gc, and Major GC is usually equivalent to Full GC; but after all these years of HotSpot VM development the terminology has become thoroughly muddled, and Major GC sometimes means old gc, so always ask what is meant before drawing conclusions.

Serial, parallel, and concurrent
GC collector implementations fall into serial, parallel, and concurrent categories.
Serial collector: e.g. Serial GC. The easiest to understand — a single thread does the collection work — and the simplest to implement.

Parallel collector: e.g. Parallel GC. Every run, whether YGC or FGC, stops the world, pausing all user threads, and uses multiple threads to collect garbage simultaneously.

Concurrent collector: e.g. CMS GC. For the young gen it behaves like a parallel collector: collection is parallel (though depending on configuration it can also be single-threaded) and stops the world. The real difference is in the old gen. During old gen collection, CMS can run concurrently with user threads most of the time, stopping the world only briefly. That is its advantage — it greatly reduces application pause times — though it has drawbacks as well.

Algorithm combinations
Of the GC algorithm combinations implemented in the HotSpot VM, CMS GC is the most widely used, since we now live in the era of large heaps.

1. Serial GC
Serial generational collector (-XX:+UseSerialGC)
A whole-heap, single-threaded combination and the earliest to appear. Back then Java heaps were still small, and collecting with a single thread did not yet make GC-induced application pauses noticeable.

2. Parallel GC
Parallel for young space, serial for old space generational collector (-XX:+UseParallelGC).
Parallel for young and old space generational collector (-XX:+UseParallelOldGC)
As Java heaps grew, the application pauses caused by GC became intolerable, so Parallel GC appeared: collecting with multiple threads clearly improves collection efficiency.

3. CMS GC
Concurrent mark sweep with serial young space collector (-XX:+UseConcMarkSweepGC
-XX:-UseParNewGC)
Concurrent mark sweep with parallel young space collector (-XX:+UseConcMarkSweepGC)
When Java heaps grow even larger, say 8 GB, the pauses caused by Parallel GC become very noticeable, so CMS GC appeared. This is the GC strategy I see used most in production. Adding -XX:+UseConcMarkSweepGC automatically selects ParNewGC for the young gen; there is no need to add -XX:+UseParNewGC separately.

CMS is attractive because, thanks to its algorithm, most of the collection can run concurrently with user threads, greatly reducing pause times. But there is a downside: after collecting the old gen, CMS does not compact it, which produces fragmentation. If those fragments cannot be reused, space is wasted, and along the way a concurrent mode failure can occur, causing a genuine full GC — a single-threaded MSC (Mark-Sweep-Compact) collection of the entire heap (young + old + perm). That is very, very slow, and the fragmentation problem is unpredictable.

4. G1 GC
G1 garbage collector (-XX:+UseG1GC); G1 is not covered here

Trigger conditions
young gc
The trigger for young gc is fairly simple: when the eden area runs out of memory, a young gc is triggered. Let's look at what happens when a chunk of memory in eden is allocated for an object — I drew a simple flow chart; I've always felt a good diagram makes a dry process more interesting.

When there is not enough space in eden, the allocation is either for an object or for a TLAB; either way there is not enough memory, so a young gc is needed to free up space in eden for subsequent allocation requests, and a user thread notifies the VM Thread that a young gc should be executed next.
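
A minimal sketch of the TLAB-related HotSpot flags for observing the allocation path described above (the TLAB size value is illustrative):

java -XX:+UseTLAB -XX:+PrintTLAB -XX:TLABSize=256k -XX:+PrintGCDetails MyApp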

full gc
1. Insufficient old gen space
When a large object or large array is created and eden cannot hold such a large allocation, the JVM tries to allocate it in the old gen; if the old gen also lacks space, a full gc is triggered. To avoid full GCs from this cause, tune so that objects can be reclaimed during young gc, and avoid creating overly large objects and arrays.

2. The average total size of objects promoted to the old gen by past young gcs exceeds the old gen's remaining space
When a young gc is about to be triggered, the JVM checks whether it is safe. "Safe" here means the old gen's current free space can hold the average amount promoted by previous young gcs, or all of the young gen's objects. If the check says unsafe, the young gc is skipped and a full gc is executed instead.

3. Insufficient perm gen space
If there is a perm gen and the system loads many classes, reflects over many classes, and calls many methods while the perm gen has no free space, a full gc is also triggered.

4. Promotion failure during ygc
Promotion failure happens in the young gc phase, i.e. CMS's ParNewGC: when an object's GC age reaches the threshold, or the to-survivor space cannot hold it, the object is copied into the old gen; if the old gen lacks space, a promotion failure occurs and a full gc follows.

In GC logs you sometimes see the keyword concurrent mode failure. What causes it? Many articles say that a concurrent mode failure triggers a full gc, but it is actually the other way around: the full gc causes the concurrent mode failure. In the CMS implementation, the cms gc we usually talk about is triggered by a background thread that by default checks the old gen's usage every 2 seconds; when usage reaches the value set by -XX:CMSInitiatingOccupancyFraction, a cms gc starts and collects the old gen concurrently. A true full gc, by contrast, is triggered by the VM thread, and only when it judges that the coming ygc would fail (for example, the previous ygc hit a promotion failure). If that full gc starts while the background thread is in the middle of a cms gc, the result is a concurrent mode failure.

Given all this, the CMSInitiatingOccupancyFraction setting matters a great deal. Set it too high and the free space left when CMS starts is too small, promotion failures during ygc become likely, and concurrent mode failures become more probable; set it too low and cms gc runs more often. Tune it according to the application's needs.

5. Calling System.gc(), jmap -histo:live <pid>, jmap -dump ...
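
For item 5, two flags are commonly used to tame explicit GC (hedged; only appropriate if nothing in the application legitimately relies on System.gc()):

java -XX:+DisableExplicitGC MyApp              # System.gc() becomes a no-op
java -XX:+ExplicitGCInvokesConcurrent MyApp    # with CMS/G1, System.gc() runs a concurrent cycle instead of a stop-the-world full GC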
References
What's the difference between Major GC and Full GC? And what triggers them?

Author: 占小狼
Link: https://www.jianshu.com/p/2750c7c202ef
Source: 简书 (Jianshu)
Copyright of Jianshu articles belongs to their authors; for any form of reproduction, please contact the author for authorization and credit the source.
heidsoft commented 5 years ago

GC trigger flow

heidsoft commented 5 years ago

Production parameter tuning

-XX:+UseParallelGC — use the Parallel Scavenge & Parallel Old collectors (the default in server mode).
-XX:+UseParallelOldGC — use the Parallel Old collector for the old generation.
-XX:ParallelGCThreads=4 — the number of threads used during collection; defaults to the number of CPUs.
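
A hedged one-liner combining the flags above (MyApp and the thread count are placeholders):

java -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:ParallelGCThreads=4 MyApp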

heidsoft commented 5 years ago

https://stackoverflow.com/questions/6236726/whats-the-difference-between-parallelgc-and-paralleloldgc https://stackoverflow.com/questions/198577/real-differences-between-java-server-and-java-client http://developer.51cto.com/art/201009/228035.htm

heidsoft commented 5 years ago

JVM source code download

The repository is linked on http://openjdk.java.net/. Clone it and execute the get_source.sh script.

$ hg clone http://hg.openjdk.java.net/jdk8/jdk8
$ cd jdk8 && sh get_source.sh

hg clone http://hg.openjdk.java.net/jdk10/jdk10 && cd jdk10 && chmod +x get_source.sh && ./get_source.sh

hg clone http://hg.openjdk.java.net/jdk8/jdk8 && cd jdk8 && chmod +x get_source.sh && ./get_source.sh

hg clone http://hg.openjdk.java.net/jdk9/jdk9 && cd jdk9 && chmod +x get_source.sh && ./get_source.sh
heidsoft commented 5 years ago

JVM command-line parameters

java -XshowSettings:vm -version — print the VM settings
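
Other -XshowSettings categories worth knowing (hedged; run java -X on your build to confirm what it supports):

java -XshowSettings:properties -version    # system properties
java -XshowSettings:all -version           # VM, properties, and locale settings together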

heidsoft commented 5 years ago

Using jmap

http://www.tianshouzhi.com/api/tutorials/jvm/99
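
A few jmap invocations that complement the tutorial above (PID and paths are placeholders; note that -histo:live and -dump:live force a full GC, as discussed earlier in this thread):

jmap -heap <pid>                                      # heap configuration and usage summary
jmap -histo <pid> | head -n 30                        # class histogram without forcing a GC
jmap -dump:live,format=b,file=/tmp/heap.hprof <pid>   # heap dump of live objects only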

heidsoft commented 5 years ago

https://stackoverflow.com/questions/13270933/what-happens-when-the-jvm-runs-out-of-memory-to-allocate-during-run-time

heidsoft commented 5 years ago

https://very-serio.us/2017/12/05/running-jvms-in-kubernetes/

heidsoft commented 5 years ago

https://www.alibabacloud.com/blog/kubernetes-demystified-restrictions-on-java-application-resources_594108

heidsoft commented 5 years ago

https://dzone.com/articles/kubernetes-demystified-restrictions-on-java-applic

heidsoft commented 5 years ago

What does the output of jstat -gcutil mean?
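
On JDK 8, jstat -gcutil <pid> 1000 samples the process every second; a hedged summary of its columns:

jstat -gcutil <pid> 1000
# S0, S1   : survivor space 0 / 1 utilization (%)
# E        : eden utilization (%)
# O        : old gen utilization (%)
# M, CCS   : metaspace / compressed class space utilization (%)  (P = perm gen on JDK 7)
# YGC, YGCT: young GC count and total time
# FGC, FGCT: full GC count and total time
# GCT      : total GC time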

heidsoft commented 5 years ago

https://stackoverflow.com/questions/2129044/java-heap-terminology-young-old-and-permanent-generations https://juejin.im/post/5a9b811a6fb9a028e46e1c88 https://stackoverflow.com/questions/40672443/java-process-memory-usage-keeps-increasing-infinitely https://arhipov.blogspot.com/2011/01/java-bytecode-fundamentals.html https://www.toptal.com/java/hunting-memory-leaks-in-java

heidsoft commented 5 years ago

https://www.pushtechnology.com/support/kb/understanding-the-java-virtual-machine-heap-for-high-performance-applications/ https://codeahoy.com/2017/08/06/basics-of-java-garbage-collection/ http://onemogin.com/java/gc/java-gc-tuning-generational.html http://jprante.github.io/2012/11/28/Elasticsearch-Java-Virtual-Machine-settings-explained.html https://www.infoq.cn/article/Secrets-of-the-Bytecode-Ninjas

heidsoft commented 5 years ago

Off-heap (native) memory analysis

https://gist.github.com/bossiernesto/ccb3a847e83ae0ddf7db0b0eae30870f https://xuxinkun.github.io/2016/05/16/memory-monitor-with-cgroup/ https://docs.docker.com/config/containers/runmetrics/ https://www.cnblogs.com/duanxz/p/10247494.html https://unix.stackexchange.com/questions/17936/setting-proc-sys-vm-drop-caches-to-clear-cache https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/sec-memory http://lovestblog.cn/blog/2016/07/20/jstat/ https://www.zybuluo.com/zero1036/note/872396 https://crunchify.com/jvm-tuning-heapsize-stacksize-garbage-collection-fundamental/ https://skorks.com/2010/03/how-to-quickly-generate-a-large-file-on-the-command-line-with-linux/ https://www.cyberciti.biz/faq/howto-create-lage-files-with-dd-command/ https://atbug.com/java8-metaspace-size-issue/ http://lovestblog.cn/blog/2016/10/29/metaspace/ https://www.cnblogs.com/yjd_hycf_space/p/7755633.html https://www.kernel.org/doc/Documentation/sysctl/vm.txt https://unix.stackexchange.com/questions/253816/restrict-size-of-buffer-cache-in-linux https://stackpointer.io/unix/linux-clear-memory-cache/403/ https://www.dynatrace.com/news/blog/how-to-identify-a-java-memory-leak/ https://www.journaldev.com/4098/java-heap-space-vs-stack-memory https://blogs.oracle.com/jonthecollector/presenting-the-permanent-generation https://betsol.com/2017/06/java-memory-management-for-java-virtual-machine-jvm/ https://stackoverflow.com/questions/31257968/how-to-access-jmx-interface-in-docker-from-outside https://www.quora.com/How-does-memory-management-work-in-Java https://zhanjindong.com/2016/03/02/jvm-memory-tunning-notes https://zhanjindong.com/2015/12/13/thinking-about-high-performance-web-service https://www.ibm.com/developerworks/linux/library/j-nativememory-linux/ https://stackoverflow.com/questions/38153381/how-to-debug-leak-in-native-memory-on-jvm https://www.baeldung.com/native-memory-tracking-in-jvm

heidsoft commented 5 years ago

https://www.cnblogs.com/peida/archive/2012/12/31/2840241.html http://s0docs0docker0com.icopy.site/config/containers/runmetrics/ https://www.cnblogs.com/youxin/p/4744652.html https://www.binarytides.com/linux-netstat-command-examples/ https://www.cyberciti.biz/tips/linux-investigate-sockets-network-connections.html https://www.cnblogs.com/duanxz/p/6115722.html https://docs.oracle.com/en/java/javase/12/vm/native-memory-tracking.html#GUID-710CAEA1-7C6D-4D80-AB0C-B0958E329407 https://qsli.github.io/2017/12/02/google-perf-tools/ https://coldwalker.com/2018/08//troubleshooter_native_memory_increase/ https://yq.aliyun.com/articles/657790 https://www.bbsmax.com/R/MyJxYAmM5n/ https://cloud.tencent.com/developer/article/1176832 https://www.cnblogs.com/zhaoyl/p/5515317.html https://blog.csdn.net/ma_mxr/article/details/87686922 https://kkewwei.github.io/elasticsearch_learning/2016/12/20/gdb%E8%B0%83%E8%AF%95java%E5%9F%BA%E6%9C%AC%E7%94%A8%E6%B3%95/ https://gist.github.com/miguno/548cd72eaec017c475448cb9b2ced258 https://stackoverflow.com/questions/6637448/how-to-find-the-address-of-a-string-in-memory-using-gdb http://openinx.github.io/2019/02/23/netty-memory-management/ https://caorong.github.io/2016/08/27/netty-hole/ https://stackoverflow.com/questions/41300520/what-is-locked-ownable-synchronizers-in-thread-dump https://blog.csdn.net/ztguang/article/details/51015758 https://fangjian0423.github.io/2016/06/04/java-thread-state/ https://blog.jrwang.me/2016/java-thread-states/ https://www.cnblogs.com/charlieroro/p/10180827.html