vmware-archive / photon-controller

Photon Controller
Other
27 stars 4 forks source link

com.vmware.photon.controller.core.Main: Failed to get token: Failed to acquire tokens. Exception msg: Token is expired.. #127

Closed tactical-drone closed 7 years ago

tactical-drone commented 7 years ago

Hi.

I had some issues with my tenant quotas, so I deleted the tenant. But then because I made a new tenant with the same name the old tenant quota's somehow got sucked into the new config. I then tried restarting the photon-contoller system (the controllers & lw and the loadbalancer) but now the controllers wont start anymore.

All they say is this:

systemctl restart photon-controller; journalctl -f -u photon-controller
-- Logs begin at Fri 2017-04-14 08:20:19 UTC. --
May 27 16:06:33 pc-1 run.sh[699]: Caused by: com.vmware.identity.openidconnect.client.TokenValidationException: Token is expired.
May 27 16:06:33 pc-1 run.sh[699]: at com.vmware.identity.openidconnect.client.IDToken.build(IDToken.java:93)
May 27 16:06:33 pc-1 run.sh[699]: at com.vmware.identity.openidconnect.client.OIDCClientUtils.parseTokenResponse(OIDCClientUtils.java:298)
May 27 16:06:33 pc-1 run.sh[699]: at com.vmware.identity.openidconnect.client.OIDCClient.acquireTokens(OIDCClient.java:180)
May 27 16:06:33 pc-1 run.sh[699]: at com.vmware.identity.openidconnect.client.OIDCClient.acquireTokensBySolutionUserCredentials(OIDCClient.java:326)
May 27 16:06:33 pc-1 run.sh[699]: at com.vmware.photon.controller.common.auth.AuthTokenHandler.getAuthTokensByServiceAccount(AuthTokenHandler.java:117)
May 27 16:06:33 pc-1 run.sh[699]: ... 3 more
May 29 09:21:29 pc-1 systemd[1]: Stopped Photon Controller Configuration Service.
May 29 09:21:29 pc-1 systemd[1]: Stopping Photon Controller Configuration Service...
May 29 09:21:29 pc-1 systemd[1]: Started Photon Controller Configuration Service.
May 29 09:21:31 pc-1 run.sh[4081]: INFO  [2017-05-29 09:21:31,231] com.vmware.photon.controller.core.Main: Creating and storing HOK token in TokenManager
May 29 09:21:31 pc-1 run.sh[4081]: INFO  [2017-05-29 09:21:31,500] com.vmware.photon.controller.core.Main: Failed to get token: Failed to acquire tokens. Exception msg: Token is expired.. Exception Stacktrace: [Ljava.lang.StackTraceElement;@5bd82fed. Stacktrace: [Ljava.lang.StackTraceElement;@c1bd0be
May 29 09:21:31 pc-1 run.sh[4081]: Exception in thread "main" com.vmware.photon.controller.common.auth.AuthException: com.vmware.photon.controller.common.auth.AuthException: Failed to acquire tokens. Exception msg: Token is expired.. Exception Stacktrace: [Ljava.lang.StackTraceElement;@5bd82fed
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.core.Main.createAndStoreHOKTokenForServiceAccount(Main.java:260)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.core.Main.main(Main.java:183)
May 29 09:21:31 pc-1 run.sh[4081]: Caused by: com.vmware.photon.controller.common.auth.AuthException: Failed to acquire tokens. Exception msg: Token is expired.. Exception Stacktrace: [Ljava.lang.StackTraceElement;@5bd82fed
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.common.auth.AuthTokenHandler.getAuthTokensByServiceAccount(AuthTokenHandler.java:119)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.common.auth.TokenManager.createHOKTokenForServiceAccount(TokenManager.java:111)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.core.Main.createAndStoreHOKTokenForServiceAccount(Main.java:255)
May 29 09:21:31 pc-1 run.sh[4081]: ... 1 more
May 29 09:21:31 pc-1 run.sh[4081]: Caused by: com.vmware.identity.openidconnect.client.TokenValidationException: Token is expired.
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.identity.openidconnect.client.IDToken.build(IDToken.java:93)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.identity.openidconnect.client.OIDCClientUtils.parseTokenResponse(OIDCClientUtils.java:298)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.identity.openidconnect.client.OIDCClient.acquireTokens(OIDCClient.java:180)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.identity.openidconnect.client.OIDCClient.acquireTokensBySolutionUserCredentials(OIDCClient.java:326)
May 29 09:21:31 pc-1 run.sh[4081]: at com.vmware.photon.controller.common.auth.AuthTokenHandler.getAuthTokensByServiceAccount(AuthTokenHandler.java:117)
May 29 09:21:31 pc-1 run.sh[4081]: ... 3 more

I did go into ligthtwave at some stage to increase that token lifetime so that the UI does not bug out every 5 seconds, but maybe now after 10 days it actually expired and now photon is confused about updating these tokens? I donno. The fact that the original value was so low also indicates that someone might have hacked in that low value because large (read sensible) values don't work?

Restarting everything does not work and trying to check lightwave logs for errors is pointless because the entire lightwave implementation is a bug according to the logs... you don't know which bug is your bug....

This is what my lightwave logs look like:

May 29 09:28:51 lw-1 vmafdd[603]: t@140603866212096: [Error - 21, ../../../server/vmafd/superlogging.c:535]
May 29 09:28:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/vecsserviceapi.c:1247]
May 29 09:28:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/vecsserviceapi.c:1610]
May 29 09:28:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/rootfetch.c:272]
May 29 09:28:52 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:00 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:06 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:21 lw-1 vmafdd[603]: t@140603866212096: [Error - 21, ../../../server/vmafd/superlogging.c:535]
May 29 09:29:22 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:30 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:36 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:29:51 lw-1 vmafdd[603]: t@140603866212096: [Error - 21, ../../../server/vmafd/superlogging.c:535]
May 29 09:29:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/vecsserviceapi.c:1247]
May 29 09:29:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/vecsserviceapi.c:1610]
May 29 09:29:51 lw-1 vmafdd[603]: t@140603950716672: [Error - 2, ../../../server/vmafd/rootfetch.c:272]
May 29 09:29:52 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:30:00 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:30:06 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:30:21 lw-1 vmafdd[603]: t@140603866212096: [Error - 21, ../../../server/vmafd/superlogging.c:535]
May 29 09:30:22 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:30:30 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)
May 29 09:30:36 lw-1 vmdnsd[726]: t@140120044840704: VmDnsRpcQueryRecords failed. Error(1168)

Just one giant bug. Even on fresh install lightwave just produces bug logs all the time. Really uncool.

tactical-drone commented 7 years ago

Maybe this has something to do with it.

image

image

tactical-drone commented 7 years ago

I have two choices when I encounter something like this in development. Just delete everything and start over or investigate.

My managers would love that I take option 1 all the time. But I know. One day my production will suddenly brick on me like this and then I would have no idea where to look or how to fix it.

This is why I always exercise option 2. I am not a very popular guy these days.

tactical-drone commented 7 years ago

Turns out all 4 ESXi servers decided to out of the blue disable my NTP config.

This kind of makes the error obvious.