Closed vetss closed 6 years ago
Hi @vetss . I think the Exception is generated at https://github.com/RestComm/jain-slee/blob/master/container/activities/src/main/java/org/mobicents/slee/runtime/activity/ActivityContextImpl.java#L163.
Currently I am don't know why cacheData gets null here. It would be helpful if you could provide more log for further investigation.:) if you agree with this, We could check "if cacheData is null then return null before invoking getObject fucntion" as a workaround solution for this issue. If it is ok. I could create a workaround fix for it. The root cause should be investigated later
Regards Anh Dao
hi @vetss do you have any idea why this problem happen, can you please attach server.log for investigating?
Hi . I would like to update status on this issue. Please ignore my previous comments. I used incorrect version of ActivityContextImpl.java then it took me go with wrong analysis.
The exception must be generated by https://github.com/RestComm/jain-slee/blob/master/container/activities/src/main/java/org/mobicents/slee/runtime/activity/ActivityContextImpl.java#L167. Not line 163 as I mentioned before. I did revert to commit "d42bdd6" to make line number as the same as called stack put from @vetss and realize that I am wrong.
156 public int getActivityFlags() { 157 if (flags == null) { 158 // instance has no flags stored, check local ac 159 if (localActivityContext != null) { 160 flags = localActivityContext.getActivityFlags(); 161 } else { 162 // local ac does not exists, get from cache 163 flags = (Integer) cacheData 164 .getObject(NODE_MAP_KEY_ACTIVITY_FLAGS); 165 } 166 } 167 return flags; 168 }
Some conclusions:
Next I will try setup cluster to see if I can find more thing on this issue.
Status update:.
I am successful to preproduce this problem by using google-talk-bot example, Jconsole, and add some sleep (delay) function in slee and xmpp RA
File modified:
xmpp.patch file I added some delay to avoid xmpp send some init handshake message. Then I have 20 seconds to trigger queryActivityContextLiveness from Jconsole core.patch file -> I add some sleep after create activitycontexthandle in Datacache and pause 20 seconds before add NODE_MAP_KEY_ACTIVITY_FLAGS keys in Cache.
Steps Step 1: Deploy google-talk-bot example. Wait until log message "duyanh debug ActivityContextImpl sleeping 20000" Step2: Quickly invoke queryActivityContextLiveness operation from Jconsole. Then this issue will be 100% reproducible
log file nullPointerException.txt.tar.gz
Called Stack
2017-05-06 11:28:21,336 ERROR [org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl] (RMI TCP Connection(2)-127.0.0.1) Failed to load AC.: java.lang.NullPointerException at org.mobicents.slee.runtime.activity.ActivityContextImpl.getActivityFlags(ActivityContextImpl.java:178) at org.mobicents.slee.runtime.activity.ActivityContextImpl.
(ActivityContextImpl.java:145) at org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.getActivityContext(ActivityContextFactoryImpl.java:192) at org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.getActivityContext(ActivityContextFactoryImpl.java:184) at org.mobicents.slee.container.management.jmx.ActivityManagementMBeanImpl$PeriodicLivelinessScanner.queryLiveness(ActivityManagementMBeanImpl.java:496) at org.mobicents.slee.container.management.jmx.ActivityManagementMBeanImpl$PeriodicLivelinessScanner.run(ActivityManagementMBeanImpl.java:520) at org.mobicents.slee.container.management.jmx.ActivityManagementMBeanImpl.queryActivityContextLiveness(ActivityManagementMBeanImpl.java:162) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at org.jboss.as.jmx.PluggableMBeanServerImpl$TcclMBeanServer.invoke(PluggableMBeanServerImpl.java:1503) at org.jboss.as.jmx.PluggableMBeanServerImpl.invoke(PluggableMBeanServerImpl.java:724) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829) at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346) at sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
The called stack I preproduced have a little bit difference compared with the Author's one. Because I trigger "queryActivityContextLiveness" manually from jconson while It's called by scheduler from called stack of author.
Conclusion The null pointer is generated by https://github.com/RestComm/jain-slee/blob/master/container/activities/src/main/java/org/mobicents/slee/runtime/activity/ActivityContextImpl.java#L167. When they try to get NODE_MAP_KEY_ACTIVITY_FLAGS from Cache:
flags = (Integer) cacheData.getObject(NODE_MAP_KEY_ACTIVITY_FLAGS);
and null pointer returned.... The reason is that: we have another thread just create New Node in Cache by https://github.com/RestComm/jain-slee/blob/master/container/activities/src/main/java/org/mobicents/slee/runtime/activity/ActivityContextImpl.java#L107 and still not invoke line 108 to add "NODE_MAP_KEY_ACTIVITY_FLAGS key" into Cache Data. This issue is rarely occured. Solution Discussion: I suggest We could:
Closing. Added return value check
I faced with an Exception at SMSC GW:
2017-04-10 05:53:18,164 ERROR [org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl] (pool-27-thread-1) Failed to load AC. java.lang.NullPointerException at org.mobicents.slee.runtime.activity.ActivityContextImpl.getActivityFlags(ActivityContextImpl.java:166) at org.mobicents.slee.runtime.activity.ActivityContextImpl.(ActivityContextImpl.java:133)
at org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.getActivityContext(ActivityContextFactoryImpl.java:186)
at org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.getActivityContext(ActivityContextFactoryImpl.java:178)
at org.mobicents.slee.container.management.jmx.ActivityManagementMBeanImpl$PeriodicLivelinessScanner.queryLiveness(ActivityManagementMBeanImpl.java:497)
at org.mobicents.slee.container.management.jmx.ActivityManagementMBeanImpl$PeriodicLivelinessScanner.run(ActivityManagementMBeanImpl.java:521)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
We need to understand a reason and provide a workaroud for SLEE does not throw an Exception and process ot more accurate.