OpenHFT / Chronicle-Queue

Micro second messaging that stores everything to disk
http://chronicle.software/products/chronicle-queue/
Apache License 2.0

JVM crash A fatal error net.openhft.chronicle.queue.impl.single.SCQIndexing.setPositionForSequenceNumber #486

Closed gs80140 closed 6 years ago

gs80140 commented 6 years ago

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fa28d7af9ab, pid=64081, tid=140295741994752
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode linux-amd64 )
# Problematic frame:
# J 28222 C2 net.openhft.chronicle.queue.impl.single.SCQIndexing.setPositionForSequenceNumber(Lnet/openhft/chronicle/queue/impl/single/StoreRecovery;Lnet/openhft/chronicle/queue/impl/ExcerptContext;JJ)V (351 bytes) @ 0x00007fa28d7af9ab [0x00007fa28d7af660+0x34b]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x00007fa27021f000): JavaThread "Shuffle-18325" [_thread_in_Java, id=7096, stack(0x00007f9925d97000,0x00007f9925dd8000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f99020201e0

Registers:
RAX=0x00007f9925c5d000, RBX=0x00007fa21766ca00, RCX=0x00007fa21766be70, RDX=0x00007f99269d8978
RSP=0x00007f9925dd6380, RBP=0x00007f99020201e0, RSI=0x00007f99020201d0, RDI=0x00007fa21766ca00
R8 =0x000000000007ba95, R9 =0x00007f9a4c554ce0, R10=0x00007f99020201d0, R11=0x0000000000000004
R12=0x00007fa27021f000, R13=0x00007f99020201d0, R14=0x00007f9a447bd888, R15=0x00007fa27021f000
RIP=0x00007fa28d7af9ab, EFLAGS=0x0000000000010246, CSGSFS=0x0000000000000033, ERR=0x0000000000000007
TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007f9925dd6380)
0x00007f9925dd6380: 00007fa21766c528 00000000000201b0
0x00007f9925dd6390: 00007f9925dd63e0 0000000000000040
0x00007f9925dd63a0: 000000000007ba95 00007fa013f45e38
0x00007f9925dd63b0: 00007fa21766cab0 00007fa21766ca00
0x00007f9925dd63c0: 00007f9a43b395b8 00007fa28f157e30
0x00007f9925dd63d0: 00007fa21766c470 00007fa28d8e73a0
0x00007f9925dd63e0: 00007fa013f46cc0 00007fa28e25a5e4
0x00007f9925dd63f0: 00007f9ff005eda8 00007f9ff005e648
0x00007f9925dd6400: 00007fa013f45f00 00007fa28efe5f8c
0x00007f9925dd6410: 00007fa013f45f00 00007fa21766c7e0
0x00007f9925dd6420: 00007fa27021f000 00007f9a4c5552f0
0x00007f9925dd6430: 0000000000000000 00007fa27021f000
0x00007f9925dd6440: 00007f9925dd6460 00007fa29d8c6033
0x00007f9925dd6450: 000000005b0e0fcf 00000000000bf73d
0x00007f9925dd6460: 0000000000000000 00007fa28e0ca194
0x00007f9925dd6470: 00007f9ff005edc8 00007fa013f45f00
0x00007f9925dd6480: 00007fa21766c938 0000000100000000
0x00007f9925dd6490: 00007fa21766c730 000000000007ba9c
0x00007f9925dd64a0: 0000000000000000 000000000007ba99
0x00007f9925dd64b0: 00007f9ff005bd90 00007fa289d8e594
0x00007f9925dd64c0: 00007f9f00000000 0000000000000000
0x00007f9925dd64d0: 0000001f00000020 0000002043c99cd8
0x00007f9925dd64e0: 000000000000001f 00007fa013f447b0
0x00007f9925dd64f0: 00007fa013f44b98 00007fa28d828e20
0x00007f9925dd6500: 00007fa013f447b0 00007f9ff005de08
0x00007f9925dd6510: 0000002000000020 00007f9ff005bdc0
0x00007f9925dd6520: 00007f9a00000001 00007f9f0000001f
0x00007f9925dd6530: 00007f9ff19d2698 0000000100000000
0x00007f9925dd6540: 0000000000000009 00007fa27021f000
0x00007f9925dd6550: 00007f9a43c99cd8 00007fa28f0a5dec
0x00007f9925dd6560: 00007fa013f447b0 00007f9aac210cf0
0x00007f9925dd6570: 0000000100000001 0000002100000000

Instructions: (pc=0x00007fa28d7af9ab)
0x00007fa28d7af98b: 10 4d 8b 41 08 48 ba 78 89 9d 26 99 7f 00 00 4c
0x00007fa28d7af99b: 3b c2 0f 85 de 06 00 00 4c 8b d6 4c 8b 44 24 20
0x00007fa28d7af9ab: 4d 89 42 10 4c 8b 54 24 08 49 83 c2 08 41 ff c3
0x00007fa28d7af9bb: 4d 63 c3 4c 8b 5f 08 49 b9 e0 b2 be 25 99 7f 00

Register to memory mapping:

RAX=0x00007f9925c5d000 is pointing into metadata
RBX=0x00007fa21766ca00 is an oop net.openhft.chronicle.bytes.MappedBytes

Stack: [0x00007f9925d97000,0x00007f9925dd8000], sp=0x00007f9925dd6380, free space=252k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 28222 C2 net.openhft.chronicle.queue.impl.single.SCQIndexing.setPositionForSequenceNumber(Lnet/openhft/chronicle/queue/impl/single/StoreRecovery;Lnet/openhft/chronicle/queue/impl/ExcerptContext;JJ)V (351 bytes) @ 0x00007fa28d7af9ab [0x00007fa28d7af660+0x34b]
J 36314 C2 net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender$StoreAppenderContext.close()V (414 bytes) @ 0x00007fa28efe5f8c [0x00007fa28efe5aa0+0x4ec]

JerryShea commented 6 years ago

In order to investigate this, we will need some more information. Can you provide a unit test which reproduces the problem? Which version of queue are you running?
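One way to collect the version details requested here is a small stdlib-only program run with the same classpath as the affected application. This is only a sketch: the class name `VersionReport` and the helper `implementationVersion` are made up for illustration, and it assumes the Chronicle Queue jar carries an `Implementation-Version` entry in its manifest, which may not be the case for every build.

```java
public class VersionReport {

    /**
     * Returns the Implementation-Version recorded in the jar manifest of the
     * package that contains the named class, or a short explanation when the
     * class or the manifest entry is missing.
     */
    static String implementationVersion(String className) {
        try {
            // Load without initializing, so no library code runs.
            Class<?> c = Class.forName(className, false, VersionReport.class.getClassLoader());
            Package p = c.getPackage();
            String v = (p == null) ? null : p.getImplementationVersion();
            return (v == null) ? "unknown (no Implementation-Version in manifest)" : v;
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version    = " + System.getProperty("java.version"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        System.out.println("os.name/os.arch = " + System.getProperty("os.name")
                + "/" + System.getProperty("os.arch"));
        // ChronicleQueue is the library's top-level queue interface.
        System.out.println("chronicle-queue = "
                + implementationVersion("net.openhft.chronicle.queue.ChronicleQueue"));
    }
}
```

If the manifest entry is missing, the version in the jar file name or in the build's dependency list is the fallback.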

tangguoqiang commented 6 years ago

I have the same issue. The Chronicle Queue version is '4.5.27' and the JDK version is 'jdk1.8.0_60'. Details below:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f6a81c5af20, pid=25375, tid=140054825572096
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode linux-amd64 )
# Problematic frame:
# J 27870 C2 net.openhft.chronicle.queue.impl.single.SCQIndexing.setPositionForSequenceNumber(Lnet/openhft/chronicle/queue/impl/single/StoreRecovery;Lnet/openhft/chronicle/queue/impl/ExcerptContext;JJ)V (351 bytes) @ 0x00007f6a81c5af20 [0x00007f6a81c5a9a0+0x580]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x00007f6a7422b800): JavaThread "Shuffle-4160" [_thread_in_Java, id=26476, stack(0x00007f610e1c9000,0x00007f610e20a000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f60ee0201e0

Registers:
RAX=0x00007f64cb875ba0, RBX=0x00007f6240a291d0, RCX=0x00007f611b4907c8, RDX=0x00007f64cb875ba0
RSP=0x00007f610e207cb0, RBP=0x00007f60ee0201e0, RSI=0x00007f60ee0201d0, RDI=0x00007f611b80b4d0
R8 =0x00007f64cb875ba0, R9 =0x00007f611b80b4d0, R10=0x00007f60ee0201d0, R11=0x000000000007bbfb
R12=0x00007f611aea9730, R13=0x00007f60ee0201d0, R14=0x0000000000000000, R15=0x00007f6a7422b800
RIP=0x00007f6a81c5af20, EFLAGS=0x0000000000010246, CSGSFS=0x0000000000000033, ERR=0x0000000000000007
TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007f610e207cb0)
0x00007f610e207cb0: 00007f64d00035b8 00000000000201b0
0x00007f610e207cc0: 00007f6a7422b800 0000000000000040
0x00007f610e207cd0: 000000000007bbfb 00007f6400000004
0x00007f610e207ce0: 00007f6a0a8019c0 00000000000201a0
0x00007f610e207cf0: 00007f64dc82a8c8 00007f64dc802628
0x00007f610e207d00: 00007f64dc82ade8 00007f6a7422b800
0x00007f610e207d10: 00007f610e207d60 00007f6a7d487fea
0x00007f610e207d20: 00007f6238000818 00007f64dc802628
0x00007f610e207d30: 00007f64ccf47c00 00007f6a8002e93c
0x00007f610e207d40: 00007f64ccf47c00 00007f64dc82a8c8
0x00007f610e207d50: 00007f611aea9730 00007f6240a311e8
0x00007f610e207d60: 0000000000000000 00007f6a7422b800
0x00007f610e207d70: 00007f610e207d90 00007f6a93de0033
0x00007f610e207d80: 000000005b138544 000000000005a97c
0x00007f610e207d90: 0000000000000000 00007f6a8218b988
0x00007f610e207da0: 00007f6a0a801030 00007f64ccf47c00
0x00007f610e207db0: 00007f64ccf47998 0000000100000000
0x00007f610e207dc0: 00007f64dc82a870 000000000007bc02
0x00007f610e207dd0: 0000000000000000 000000000007bbff
0x00007f610e207de0: 0000000000000010 0000000000000000
0x00007f610e207df0: 00007f629d883c60 00007f64dc802708
0x00007f610e207e00: 00007f6616066958 00007f66160fd570
0x00007f610e207e10: 00007f6600000000 00007f64dc8026a8
0x00007f610e207e20: 00007f6238de13a0 00007f6a7e2e9e3c
0x00007f610e207e30: 00007f64dc82b000 0000001f00000020
0x00007f610e207e40: 000000203de97316 00007f64dc802708
0x00007f610e207e50: 0000001f00000020 0000000000000020
0x00007f610e207e60: 00007f62a576f360 00007f64dc802628
0x00007f610e207e70: 00007f64dc82afa0 0000000000000000
0x00007f610e207e80: 00007f6238dc3ff8 0000002000000020
0x00007f610e207e90: 00007f64dc802708 00007f660000001f
0x00007f610e207ea0: 0000000000000000 00007f66160fb628

Instructions: (pc=0x00007f6a81c5af20)
0x00007f6a81c5af00: 10 4c 8b 4b 08 48 bf d0 b4 80 1b 61 7f 00 00 4c
0x00007f6a81c5af10: 3b cf 0f 85 89 06 00 00 4d 8b d5 4c 8b 5c 24 20
0x00007f6a81c5af20: 4d 89 5a 10 4c 8b 54 24 08 49 83 c2 08 44 8b 5c
0x00007f6a81c5af30: 24 28 41 ff c3 4d 63 c3 4c 8b 5a 08 49 b9 50 81

Register to memory mapping:

RAX=0x00007f64cb875ba0 is an oop net.openhft.chronicle.bytes.MappedBytes

Stack: [0x00007f610e1c9000,0x00007f610e20a000], sp=0x00007f610e207cb0, free space=251k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 27870 C2 net.openhft.chronicle.queue.impl.single.SCQIndexing.setPositionForSequenceNumber(Lnet/openhft/chronicle/queue/impl/single/StoreRecovery;Lnet/openhft/chronicle/queue/impl/ExcerptContext;JJ)V (351 bytes) @ 0x00007f6a81c5af20 [0x00007f6a81c5a9a0+0x580]
J 18934 C2 net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender$StoreAppenderContext.close()V (414 bytes) @ 0x00007f6a8002e93c [0x00007f6a8002e460+0x4dc]
J 31992 C2 com.jnj.adf.curation.DiskBlockQueue.offer(Ljava/lang/Object;JLjava/util/concurrent/TimeUnit;)Z (12 bytes) @ 0x00007f6a8218b988 [0x00007f6a8218afa0+0x9e8]
J 25068 C1 com.jnj.adf.connector.adf.client.DataStreamResultCollector.endResults()V (68 bytes) @ 0x00007f6a7e2e9e3c [0x00007f6a7e2e9ca0+0x19c]
J 33839 C2 com.gemstone.gemfire.cache.client.internal.ExecuteRegionFunctionSingleHopOp.execute(Lcom/gemstone/gemfire/cache/client/internal/ExecutablePool;Lcom/gemstone/gemfire/cache/Region;Ljava/lang/String;Lcom/gemstone/gemfire/internal/cache/execute/ServerRegionFunctionExecutor;Lcom/gemstone/gemfire/cache/execute/ResultCollector;BLjava/util/Map;IZZZ)V (211 bytes) @ 0x00007f6a825fc784 [0x00007f6a825fc500+0x284]
J 25230 C1 com.gemstone.gemfire.cache.client.internal.ServerRegionProxy.executeFunction(Ljava/lang/String;Ljava/lang/String;Lcom/gemstone/gemfire/internal/cache/execute/ServerRegionFunctionExecutor;Lcom/gemstone/gemfire/cache/execute/ResultCollector;BZZZ)V (349 bytes) @ 0x00007f6a7eb2b6cc [0x00007f6a7eb29fe0+0x16ec]
J 33962 C2 com.gemstone.gemfire.internal.cache.execute.ServerRegionFunctionExecutor.executeOnServer(Ljava/lang/String;Lcom/gemstone/gemfire/cache/execute/ResultCollector;BZZ)Lcom/gemstone/gemfire/cache/execute/ResultCollector; (98 bytes) @ 0x00007f6a81749940 [0x00007f6a817496e0+0x260]
J 33963 C2 com.gemstone.gemfire.internal.cache.execute.ServerRegionFunctionExecutor.executeFunction(Ljava/lang/String;ZZZ)Lcom/gemstone/gemfire/cache/execute/ResultCollector; (157 bytes) @ 0x00007f6a816c9860 [0x00007f6a816c96e0+0x180]
J 25226 C1 com.gemstone.gemfire.internal.cache.execute.ServerRegionFunctionExecutor.execute(Ljava/lang/String;)Lcom/gemstone/gemfire/cache/execute/ResultCollector; (131 bytes) @ 0x00007f6a7ef98a2c [0x00007f6a7ef980e0+0x94c]
J 32294 C2 org.springframework.data.gemfire.function.execution.ADFFunctionTemplate.doRemoteCall(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;[Ljava/lang/Class;[Ljava/lang/Object;)Ljava/lang/Object; (584 bytes) @ 0x00007f6a8213f48c [0x00007f6a8213dbc0+0x18cc]
J 20558 C1 org.springframework.data.gemfire.function.execution.ADFFunctionTemplate.remoteCall(Ljava/lang/String;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object; (126 bytes) @ 0x00007f6a7f6d8a94 [0x00007f6a7f6d8620+0x474]
J 33833 C1 com.jnj.adf.config.CglibRemoteServiceProxy.intercept(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;Lorg/springframework/cglib/proxy/MethodProxy;)Ljava/lang/Object; (365 bytes) @ 0x00007f6a826ec784 [0x00007f6a826eb000+0x1784]
J 27546 C1 com.jnj.adf.connector.adf.client.IAdfDataStreamService$$EnhancerByCGLIB$$ab537d4a.getRegionDataByQuery(Ljava/lang/String;[Ljava/lang/String;Ljava/lang/String;[BILcom/gemstone/gemfire/cache/execute/ResultCollector;)V (88 bytes) @ 0x00007f6a7fc59cd4 [0x00007f6a7fc59220+0xab4]
J 27089 C1 com.jnj.adf.curation.CurationUtils$5.run()V (42 bytes) @ 0x00007f6a819fd854 [0x00007f6a819fce80+0x9d4]
J 33843 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x00007f6a7e68a140 [0x00007f6a7e689ee0+0x260]
J 9990 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x00007f6a7e94ddfc [0x00007f6a7e94dcc0+0x13c]
J 18603 C2 java.lang.Thread.run()V (17 bytes) @ 0x00007f6a7d190f68 [0x00007f6a7d190f20+0x48]
v ~StubRoutines::call_stub
V [libjvm.so+0x68bbe6] JavaCalls::call_helper(JavaValue, methodHandle, JavaCallArguments, Thread)+0x1056
V [libjvm.so+0x68c0f1] JavaCalls::call_virtual(JavaValue, KlassHandle, Symbol, Symbol, JavaCallArguments, Thread)+0x321
V [libjvm.so+0x68c597] JavaCalls::call_virtual(JavaValue, Handle, KlassHandle, Symbol, Symbol, Thread)+0x47
V [libjvm.so+0x7232d0] thread_entry(JavaThread, Thread)+0xa0
V [libjvm.so+0xa68f3f] JavaThread::thread_main_inner()+0xdf
V [libjvm.so+0xa6906c] JavaThread::run()+0x11c
V [libjvm.so+0x91cb88] java_start(Thread)+0x108
C [libpthread.so.0+0x7dc5] start_thread+0xc5
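Reading the two dumps side by side, the bytes at the faulting pc (`4d 89 42 10` in the first report, `4d 89 5a 10` in this one) can be decoded by hand. The sketch below is not part of either report: the class `FaultingStoreCheck` and its helpers are hypothetical, and it handles only the single instruction form seen here. It decodes those four bytes and checks that R10 + 0x10 equals the reported `si_addr` in both crashes.

```java
public class FaultingStoreCheck {

    /** 64-bit register names in x86-64 encoding order (0..15). */
    private static final String[] REGS = {
        "RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
        "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15"
    };

    /**
     * Decodes only REX.W + 0x89 (MOV r/m64, r64) with mod=01, i.e. a store to
     * [base + disp8] with a non-negative displacement and no SIB byte.
     */
    static String decodeStore(int rex, int opcode, int modrm, int disp8) {
        if ((rex & 0xF8) != 0x48 || opcode != 0x89) {
            throw new IllegalArgumentException("not a REX.W MOV r/m64, r64");
        }
        int mod = (modrm >> 6) & 3;
        int reg = ((modrm >> 3) & 7) | ((rex & 4) << 1); // REX.R extends the source register
        int rm = (modrm & 7) | ((rex & 1) << 3);         // REX.B extends the base register
        if (mod != 1 || (modrm & 7) == 4 || disp8 < 0 || disp8 > 0x7f) {
            throw new IllegalArgumentException("only [base + disp8] is handled");
        }
        return String.format("mov [%s+0x%x], %s", REGS[rm], disp8, REGS[reg]);
    }

    /** True when base + displacement is the address reported in siginfo. */
    static boolean matchesFaultAddress(long base, int disp, long siAddr) {
        return base + disp == siAddr;
    }

    public static void main(String[] args) {
        // First report: R10 and si_addr copied from the dump.
        System.out.println(decodeStore(0x4d, 0x89, 0x42, 0x10) + " -> fault address matches: "
                + matchesFaultAddress(0x00007f99020201d0L, 0x10, 0x00007f99020201e0L));
        // Second report: R10 and si_addr copied from the dump.
        System.out.println(decodeStore(0x4d, 0x89, 0x5a, 0x10) + " -> fault address matches: "
                + matchesFaultAddress(0x00007f60ee0201d0L, 0x10, 0x00007f60ee0201e0L));
    }
}
```

In both crashes the faulting instruction is an 8-byte store to R10 + 0x10, and `SEGV_MAPERR` means that address was not mapped in the process at the time. That matches the write happening inside `setPositionForSequenceNumber`, but it does not by itself say why the target was unmapped.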

gs80140 commented 6 years ago

This happens in our production environment, so it is hard to provide code that reproduces it; so far it has only occurred in production. I am trying to reproduce it in a local environment too.

peter-lawrey commented 6 years ago

I suggest upgrading to Java 8 update 171 and using Chronicle Queue 4.15.1, which is the latest. If you would like help migrating queue to the latest version, or commercial support, please contact sales@chronicle.software

tangguoqiang commented 6 years ago

Do you have any idea about this issue?

saint-cygnum commented 6 years ago

@peter-lawrey, other than the CPU/PSU/SPU differences, is there any reason why you advised using JDK 8 update 171?

Edit: the latest version I can see on the Oracle download page is 172.

Thanks.

RobAustin commented 6 years ago

@gs80140 were you able to reproduce this in your dev environment? Can you also retest with the latest version of Chronicle Queue? Thanks.

RobAustin commented 6 years ago

Closing, as there has been no response from @gs80140 for 2 weeks.