Open PeriGK opened 4 years ago
Hi, thanks for your interest.
Unfortunately, unless you fill out the issue template, including "Steps to Reproduce", we're unlikely to be able to help.
Please could you do that?
Thanks
Hi @alexellis, sorry about that, I forgot. I have fixed it now.
More details:
I spotted that the containers affected by this problem show the following error in the output of docker service ps service_name. Check the ERROR column:
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
3omhdi90dsvg wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
obee7cwi3ffa \_ wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
nf1i6pwoct3v \_ wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
sai0wpxjx0z3 \_ wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
j9c0yzil6idv \_ wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
opfuml973pxs \_ wordcount.1 functions/alpine:latest moro-dell Shutdown Failed 2 months ago "No such container: wordcount.…"
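For anyone trying to check the same thing on their own swarm, here is a minimal sketch of pulling the failed tasks out of docker service ps programmatically. It assumes the documented --format placeholders (.Name, .DesiredState, .CurrentState, .Error) and the --no-trunc flag; the function names and the PS_FORMAT constant are made up for illustration and are not part of OpenFaaS or Docker.

```python
import subprocess

# Tab-separated Go template passed to `docker service ps --format`.
PS_FORMAT = "{{.Name}}\t{{.DesiredState}}\t{{.CurrentState}}\t{{.Error}}"


def parse_task_lines(text):
    """Turn tab-separated `docker service ps` output into a list of dicts."""
    tasks = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Pad so that an empty trailing Error column still unpacks cleanly.
        fields = (line.split("\t") + ["", "", "", ""])[:4]
        name, desired, current, error = (f.strip() for f in fields)
        tasks.append(
            {"name": name, "desired": desired, "current": current, "error": error}
        )
    return tasks


def failed_tasks(tasks):
    """Keep only the tasks whose current state starts with 'Failed'."""
    return [t for t in tasks if t["current"].startswith("Failed")]


def list_failed(service_name):
    """Run `docker service ps` for one service (hypothetical helper; needs a swarm manager)."""
    result = subprocess.run(
        ["docker", "service", "ps", "--no-trunc", "--format", PS_FORMAT, service_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return failed_tasks(parse_task_lines(result.stdout))
```

Calling list_failed("wordcount") on a manager node should return one entry per failed task, with the untruncated text of the ERROR column in the "error" key.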
Hi @alexellis
I did some more investigation. I managed to reproduce it with a function which was working in the afternoon but not the next morning. It looks like the swarm manager couldn't recover the container after I shut down my machine.
Not sure if you have any input on that, but I don't see any other explanation.
Thanks, P.
My actions before raising this issue
Hi,
I have written a few functions in openfaas. Some of those are not used very frequently. This morning I tried to send a request to a function that was not touched (HTTP request/build/deploy) for a few weeks.
The function never came up.
Some facts that came from my investigation:
The docker service ps command returns a shutdown state.
docker service inspect returns a MaxAttempts of 5 in the RestartPolicy, which may or may not be related.
In the meantime, since this is a local environment, I have shut down my machine every night, which I suppose affects the issue one way or another.
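To make the MaxAttempts observation easier to check, here is a rough sketch that reads the restart policy from the JSON printed by docker service inspect and compares it with the number of failed tasks. It assumes the usual layout of that JSON (an array of services with Spec.TaskTemplate.RestartPolicy); the function names are hypothetical. The comparison is only a heuristic: the swarm caps how much task history it keeps, so the visible failure count may understate the real one, and nothing here confirms that the restart limit is the actual cause.

```python
import json


def restart_policy(inspect_json):
    """Extract the RestartPolicy dict from `docker service inspect <name>` output."""
    # `docker service inspect` prints a JSON array, one object per service.
    services = json.loads(inspect_json)
    spec = services[0]["Spec"]
    return spec.get("TaskTemplate", {}).get("RestartPolicy", {})


def restarts_exhausted(policy, failed_count):
    """Heuristic: has the service failed more often than MaxAttempts allows?

    A MaxAttempts of 0 (or a missing value) is treated as "no limit".
    """
    max_attempts = policy.get("MaxAttempts", 0)
    if max_attempts == 0:
        return False
    # One initial task plus max_attempts restarts gives max_attempts + 1 failures.
    return failed_count > max_attempts
```

With a MaxAttempts of 5 and the six failed tasks listed by docker service ps, restarts_exhausted returns True, which would be consistent with the swarm giving up on the service, though it does not prove it.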
Are any of those related? What about the read_timeout/write_timeout settings?
Expected Behaviour
The function should recover in response to the invoke/HTTP request, with some expected delay.
Everything goes back to normal if I do a build and deploy again from the faas-cli (a new function with the same contents as the old one), but I would like this to happen without any manual intervention.
Current Behaviour
The function does not recover from the shutdown state.
Possible Solution
Steps to Reproduce (for bugs)
return {"hello": "world"}
would suffice.
Context
I understand this is a common concern, so I don't think this is a bug, but rather a gap in my understanding or in the documentation.
So my questions are:
Your Environment
faas-cli version:
CLI:
 commit: 73004c23e5a4d3fdb7352f953247473477477a64
 version: 0.11.3
Gateway
 uri: http://127.0.0.1:8080
 version: 0.18.10
 sha: 80b6976c106370a7081b2f8e9099a6ea9638e1f3
 commit: Update Golang versions to 1.12
Provider
 name: faas-swarm
 orchestration: swarm
 version: 0.8.2
 sha: 47988f8ba284678f3eb86eb62f75f72bafeec4d9
Your faas-cli version (0.11.3) may be out of date. Version: 0.12.2 is now available on GitHub.
Are you using Docker Swarm or Kubernetes (FaaS-netes)? Docker Swarm
Operating System and version (e.g. Linux, Windows, MacOS): Linux
Code example or link to GitHub repo or gist to reproduce problem: N/A
Other diagnostic information / logs from troubleshooting guide The service shows no logs.
Thanks, P.