renoki-co / php-k8s

Unofficial PHP client for Kubernetes. It supports any form of authentication, the exec API, and it has an easy implementation for CRDs.
Apache License 2.0
307 stars, 56 forks

Exception when automounted token is updated by k8s (?): "Failed to open stream: No such file or directory" #398

Open dev-maniac opened 9 months ago

dev-maniac commented 9 months ago

Since k8s 1.22, automounted service account tokens have a limited lifespan and are updated by k8s when they expire.

It seems there can be a short window during which the access token is not accessible. I got the following exception on Dec 8, 2023 at 13:36 UTC:

file_get_contents(/var/run/secrets/kubernetes.io/serviceaccount/token): Failed to open stream: No such file or directory

Directory in pod (times also in UTC):

drwxr-xr-x 2 root root  100 Dec  8 13:35 ..2023_12_08_13_35_55.230509253/
lrwxrwxrwx 1 root root   31 Dec  8 13:35 ..data -> ..2023_12_08_13_35_55.230509253/
lrwxrwxrwx 1 root root   13 Dec  8 12:47 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root   16 Dec  8 12:47 namespace -> ..data/namespace
lrwxrwxrwx 1 root root   12 Dec  8 12:47 token -> ..data/token

The ..data symlink was re-created at 13:35, one minute before the exception, so the error really does seem to be related to the token update.

I have not found anything about this issue on the net yet, and I am not entirely sure whether I should report it somewhere else.

A simple fix I would propose is to retry reading the file two or three times with a short sleep in between.
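A minimal sketch of that retry idea, for illustration only. The function name `readFileWithRetry`, the default of three attempts, and the 100 ms pause are all assumptions and not part of php-k8s:

```php
<?php
/**
 * Hypothetical helper (not part of php-k8s): read a file, retrying a few
 * times with a short pause between attempts.
 */
function readFileWithRetry(string $path, int $attempts = 3, int $sleepMicros = 100000): string
{
    for ($i = 1; $i <= $attempts; $i++) {
        // Suppress the warning; a failed read is signalled by a false return value.
        $contents = @file_get_contents($path);
        if ($contents !== false) {
            return $contents;
        }
        // Pause before the next attempt, but not after the last one.
        if ($i < $attempts) {
            usleep($sleepMicros);
        }
    }

    throw new RuntimeException("Unable to read $path after $attempts attempts");
}
```

Since the library reads the token file itself, a helper like this would have to live inside the library or in code that reads the token before handing it over.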

fieteboerner commented 5 months ago

I encountered the exact same issue and was able to narrow it down to PHP's realpath cache. The token file is a symlink into a directory that holds the latest token, and Kubernetes creates a new directory whenever the token gets renewed. PHP therefore keeps referring to the old path even if the old token no longer exists, which leads to a rather misleading error message.
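To make the layout concrete, here is a rough simulation of such a symlink swap in a temp directory. It assumes the volume is updated by writing a new timestamped directory and then repointing ..data with a rename, which is consistent with the listing in the first comment. The old directory name and the token values are made up, and the script only illustrates the layout; it does not try to reproduce the stale-cache error itself.

```php
<?php
// Simulated service account volume in a temp dir (names and values are made up).
$base = sys_get_temp_dir() . '/sa-volume-' . uniqid();
mkdir($base);

$oldDir = '..2023_12_08_12_47_00.000000001';
$newDir = '..2023_12_08_13_35_55.230509253';

// Initial state: token -> ..data/token, ..data -> timestamped directory.
mkdir("$base/$oldDir");
file_put_contents("$base/$oldDir/token", 'old-token');
symlink($oldDir, "$base/..data");
symlink('..data/token', "$base/token");

$before = file_get_contents("$base/token");

// Rotation: publish the new directory, repoint ..data via rename,
// then remove the old directory.
mkdir("$base/$newDir");
file_put_contents("$base/$newDir/token", 'new-token');
symlink($newDir, "$base/..data_tmp");
rename("$base/..data_tmp", "$base/..data");
unlink("$base/$oldDir/token");
rmdir("$base/$oldDir");

// Drop PHP's cached stat and realpath entries before re-reading.
clearstatcache(true);
$after = file_get_contents("$base/token");

echo $before, ' -> ', $after, PHP_EOL;
```

The token symlink itself never changes, which matches the 12:47 timestamp in the listing; only ..data and the directory behind it are replaced.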

So even sleeping for some seconds wouldn't help in this case.

Luckily, PHP has a function to clear the entire realpath cache or just a single entry: clearstatcache(true) or clearstatcache(true, '/path/to/symlink').

So what I am doing now is clearing the cache entry for the token symlink every time I connect to the cluster. Now it works without the annoying error:


// Drop the cached stat/realpath entry for the token symlink before connecting.
clearstatcache(true, '/var/run/secrets/kubernetes.io/serviceaccount/token');

KubernetesCluster::inClusterConfiguration(config('k8s.cluster.apiUri'));

This fixes the error, but it would be nice if the library did this itself.

fieteboerner commented 3 months ago

After long-term observation, I still encountered this issue a few times a week, even with this line of code:

clearstatcache(true, '/var/run/secrets/kubernetes.io/serviceaccount/token');

But after replacing the path-specific cache clear with:

clearstatcache(true);

every time before we initialize this library, it has been working for 3-4 weeks without a single error. It is not the cleanest approach, but if it isn't called on every request of your web server, it could be a good workaround.
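One possible variation for setups where clearing the whole cache on every request is too costly: clear it only after a read has actually failed, then try once more. This is an untested idea rather than something confirmed in this thread, and the helper name `readWithCacheClearOnFailure` is hypothetical and not part of php-k8s:

```php
<?php
/**
 * Hypothetical helper (not part of php-k8s): read a file and, only if that
 * fails, clear PHP's stat and realpath caches and try one more time.
 */
function readWithCacheClearOnFailure(string $path): string
{
    // First attempt uses whatever PHP currently has cached.
    $contents = @file_get_contents($path);

    if ($contents === false) {
        // Clear the stat cache and the entire realpath cache, then retry once.
        clearstatcache(true);
        $contents = @file_get_contents($path);
    }

    if ($contents === false) {
        throw new RuntimeException("Unable to read $path");
    }

    return $contents;
}
```

Whether a clear after the failure is enough to recover in the situation described above is an assumption; the observations in this thread only cover clearing before the library is initialized.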