spring-projects / spring-boot-data-geode

Spring Boot support for Apache Geode and VMware GemFire
https://projects.spring.io/spring-boot
Apache License 2.0
45 stars 49 forks source link

GemFireSecurityException: Error: Anonymous User on running application on PCF #17

Closed trayosam closed 5 years ago

trayosam commented 5 years ago

We received an exception on a running application (i.e. no deployment or restart in between). It seems like somehow the PCC binding got messed up or dropped. After restarting the server it got fixed. Can someone please check what could possibly the issue so that we can avoid this in future on production? Below is stack trace from the log:

2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:677) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:803) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1459) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_192]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_192]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.5.23.jar!/:8.5.23]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] Caused by: org.apache.geode.security.GemFireSecurityException: Error: Anonymous User
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.security.IntegratedSecurityService.getSubject(IntegratedSecurityService.java:117) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.security.IntegratedSecurityService.authorize(IntegratedSecurityService.java:231) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.security.IntegratedSecurityService.authorize(IntegratedSecurityService.java:219) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.security.IntegratedSecurityService.authorize(IntegratedSecurityService.java:209) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.command.KeySet.cmdExecute(KeySet.java:99) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:162) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:785) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.LegacyServerConnection.doOneMessage(LegacyServerConnection.java:85) ~[na:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1166) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_192]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_192]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$1$1.run(AcceptorImpl.java:557) ~[geode-core-1.0.0-incubating.jar!/:na]
2018-11-28T16:28:53.424-05:00 [APP/PROC/WEB/0] [OUT] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_192]
jxblum commented 5 years ago

Hi @trayosam -

First, my apologies for the delay in getting back to you and sorry to hear of your troubles.

That is a bizarre Stack Trace/error message. Was this logged in your Spring [Boot], GemFire client, Web application or did you pull this Stack Trace from the server logs?

The reason I ask is, the cause by part of the Stack Trace is definitely from the server but the upper part seems to have originated from the Web app, during an HTTP request. It is possible for the client to see the Stack Trace of the server in its logs exactly as it occurred on the server(s) but you will typically see a ServerOperationException somewhere in the stack. I suspect this is the case.

Also, what do you mean by "restarting the server"? Do you mean the PCC cluster of servers or your Web app?

If the former, then I would say the problem is "server-side" and the PCC cluster itself had some intermittent issue, and why a PCC cluster/servers restart resolved the issue.

However, if the later, then it would seem the client (Web app) lost its credentials somehow during the authentication of a cache operation, and in this case, it appears to be a KeySet cache operation.

The problem originating from the client is also more consistent with the fact that the server seemingly received an "Anonymous" user (??), meaning the auth credentials were not sent, a bug in the GemFire cache client (perhaps)??

At any rate, for clarification, could you please provide answers to my questions above?

Also, if you could please share:

1) Version of SBDG being used (e.g. 1.0.0.M3) in your Webapp? 2) Version of PCC being used? 3) (Spring) configuration of your Spring Boot, GemFire cache client Web app. 4) Log files for both the client and server. 5) Description of the specific action (e.g. HTTP request) and associated cache operation your Web app was performing when this GemFireSecurityException is thrown. 6) The frequency of this exception. 7) If possible, an example/test that reproduces the problem. 8) Or alternatively, and if possible, a link to a (GitHub) source repository containing the source for your Web app.

Anything you can provide us to help you is greatly appreciated.

Thanks, @jxblum

jxblum commented 5 years ago

Just a quick update... I pinged the Pivotal GemFire/PCC team internally, and I am told this is (possibly) a known issue in the GemFire cache client.

I am having them provide me the JIRA ticket number for either Pivotal GemFire or Apache Geode (JIRA ticket system here).

I will report back when I know more.

jxblum commented 5 years ago

Hi @trayosam -

So, I tracked this down and found out that this is a known issue in the Pivotal GemFire (and by extension, Apache Geode) cache client.

The corresponding Apache Geode JIRA tickets are GEODE-4076 and GEODE-4270.

The GEODE JIRA tickets were resolved in Apache Geode 1.4.0, which means that Pivotal GemFire would have picked up these fixes in 9.3, which subsequently means that PCC version 1.4.1, will contain the fixes you need.

Hope this helps!

Regards, @jxblum