apache / openwhisk

Apache OpenWhisk is an open source serverless cloud platform
https://openwhisk.apache.org/
Apache License 2.0
6.54k stars 1.17k forks source link

Undocumented Container-Reuse #3516

Closed pneuschwander closed 6 years ago

pneuschwander commented 6 years ago

Noticed a "bug" in the documentation:

https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

There is no word about container reuse or "warm" containers at all.

Instead:

markusthoemmes commented 6 years ago

You're right. FWIW, this bit of documentation is modeled after https://medium.com/openwhisk/uncovering-the-magic-how-serverless-platforms-really-work-3cb127b05f71. The bits about warm/pre-warm/cold are in https://medium.com/openwhisk/squeezing-the-milliseconds-how-to-make-serverless-platforms-blazing-fast-aea0e9951bd0

pneuschwander commented 6 years ago

When using a Java Action, it seems that the container is immediately destroyed as documented :-( Spotted for a short moment:

56367d8597aa whisk/java8action:latest "java -jar /javaActi…" Less than a second ago Up Less than a second wsk00_82_guest_helloJava

docker ps shows prewarm containers only for nodejs...

9e896a93b792 whisk/nodejs6action:latest "/bin/sh -c 'node --…" 38 minutes ago Up 38 minutes wsk00_35_prewarm_nodejs6 35c6ea6de1a5 whisk/nodejs6action:latest "/bin/sh -c 'node --…" 22 hours ago Up 22 hours wsk00_34_prewarm_nodejs6

Is "the cool stuff" not available for Java?

rabbah commented 6 years ago

All actions are treated the same way wrt their life cycle (create, init, run, pause/resume, delete) as that logic is action kind agnostic.

The only distinction that exists today is which containers are allocated as stem cells by default and that’s node for now.

If an action is destroyed after a run, it’s because it error’ed out or the docker command to pause it failed. Check your system logs for the invoker for errors there might be a clue there.

pneuschwander commented 6 years ago

My setup: Vagrant on Windows (https://github.com/apache/incubator-openwhisk/#quick-start)

Sample Action helloJava (https://github.com/apache/incubator-openwhisk/blob/master/docs/actions.md#creating-java-actions)

@rabbah Can you tell me where to find that log?

btw: When I execute docker ps in the VM (accessed via vagrant ssh) the result is "Cannot connect to the Docker daemon at tcp://0.0.0.0:4243. Is the docker daemon running?" In order to work properly, I need to prepend sudo. Could that be related?

chetanmeh commented 6 years ago

@regmebaby sudo should not be required. Can you check the version of ubuntu box being used. May be you are seeing issue seen in #3465 ?

vagrant box list | grep xenial

pneuschwander commented 6 years ago

@chetanmeh

ubuntu/xenial64 (virtualbox, 20180323.0.0)

chetanmeh commented 6 years ago

I got confused - Checked again on my vagrant box and sudo would be required. So that part of your setup looks ok

rabbah commented 6 years ago

wskadmin syslog get invoker0 should do to retrieve the logs - if that doesn't work you can inspect /tmp/wsklogs/invoker0/*.log.

pneuschwander commented 6 years ago

@rabbah

[2018-04-04T16:34:28.051Z] [INFO] [#tid_7] [InvokerReactive] [marker:invoker_activation_start:8] [2018-04-04T16:34:28.054Z] [INFO] [#tid_7] [CouchDbRestStore] [ATT_GET] 'whisk_local_whisks' finding attachment 'jarfile' of document 'id: guest/helloJava, rev: 5-2f61defb9b6bc4a4887e99ebed780d52' [marker:database_getDocumentAttachment_start:11] [2018-04-04T16:34:28.054Z] [INFO] [#tid_7] [WhiskAction] [GET] serving from cache: CacheKey(guest/helloJava) [marker:database_cacheHit_count:11] [2018-04-04T16:34:28.088Z] [INFO] [#tid_7] [CouchDbRestStore] [marker:database_getDocumentAttachment_finish:45:34] [2018-04-04T16:34:28.089Z] [INFO] [#tid_7] [ContainerPool] containerStart containerState: cold action: helloJava namespace: guest activationId: f6919b73a3774265919b73a377d26509 [marker:invoker_containerStart.cold_count:47] [2018-04-04T16:34:28.090Z] [INFO] [#tid_7] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 2 --memory 256m --memory-swap 256m --network bridge -e __OW_API_HOST=https://172.17.0.1:443 --name wsk00_8_guest_helloJava --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file whisk/java8action:latest (timeout: 1 minute) [marker:invoker_docker.run_start:48] [2018-04-04T16:34:28.455Z] [INFO] [#tid_7] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:413:365] [2018-04-04T16:34:28.456Z] [INFO] [#tid_7] [DockerContainer] sending initialization to ContainerId(e244e5ddf0900839a0a1302c9ca5d189fbc548469aa45d0cd851abf064ff127a) ContainerAddress(172.17.0.12,8080) [marker:invoker_activationInit_start:413] [2018-04-04T16:34:28.771Z] [INFO] [#tid_7] [DockerContainer] initialization result: ok [marker:invoker_activationInit_finish:728:315] [2018-04-04T16:34:28.771Z] [INFO] [#tid_7] [DockerContainer] sending arguments to /guest/helloJava at ContainerId(e244e5ddf0900839a0a1302c9ca5d189fbc548469aa45d0cd851abf064ff127a) ContainerAddress(172.17.0.12,8080) [marker:invoker_activationRun_start:729] [2018-04-04T16:34:28.798Z] [INFO] [#tid_7] [DockerContainer] running result: ok [marker:invoker_activationRun_finish:756:27] [2018-04-04T16:34:28.799Z] [INFO] [#tid_7] [ContainerProxy] [marker:invoker_collectLogs_start:756] [2018-04-04T16:34:28.801Z] [INFO] [#sid_140] [CouchDbRestStore] 'whisk_local_activations' saving 1 documents [marker:database_saveDocumentBulk_start:5803047] [2018-04-04T16:34:28.801Z] [INFO] [#tid_7] [ContainerProxy] [marker:invoker_collectLogs_finish:759:3] [2018-04-04T16:34:28.801Z] [INFO] [#tid_7] [CouchDbRestStore] [PUT] 'whisk_local_activations' saving document: 'id: guest/f6919b73a3774265919b73a377d26509, rev: null' [marker:database_saveDocument_start:759] [2018-04-04T16:34:28.805Z] [INFO] [#tid_7] [InvokerReactive] posted completion of activation f6919b73a3774265919b73a377d26509 [2018-04-04T16:34:28.823Z] [INFO] [#sid_140] [CouchDbRestStore] [marker:database_saveDocumentBulk_finish:5803068:21] [2018-04-04T16:34:28.823Z] [INFO] [#tid_7] [CouchDbRestStore] [marker:database_saveDocument_finish:781:22] [2018-04-04T16:34:28.864Z] [INFO] [#sid_102] [RuncClient] running /usr/bin/docker-runc pause e244e5ddf0900839a0a1302c9ca5d189fbc548469aa45d0cd851abf064ff127a (timeout: 10 seconds) [marker:invoker_runc.pause_start:5803112] [2018-04-04T16:34:28.871Z] [ERROR] [#sid_102] [RuncClient] code: 1 , stdout: , stderr: open /run/runc/e244e5ddf0900839a0a1302c9ca5d189fbc548469aa45d0cd851abf064ff127a/state.json: no such file or directory [marker:invoker_runc.pause_error:5803120:8] [2018-04-04T16:34:28.872Z] [INFO] [#sid_102] [DockerClientWithFileAccess] running /usr/bin/docker rm -f e244e5ddf0900839a0a1302c9ca5d189fbc548469aa45d0cd851abf064ff127a (timeout: 1 minute) [marker:invoker_docker.rm_start:5803121] [2018-04-04T16:34:29.182Z] [INFO] [#sid_102] [DockerClientWithFileAccess] [marker:invoker_docker.rm_finish:5803431:310]

Maybe a permission problem?

pneuschwander commented 6 years ago

Invoker tries to access files in /run/runc/ but the files are in /run/docker/runtime-runc/moby/. Any suggestions how to fix this?

chetanmeh commented 6 years ago

Have a look at runc support for steps on disabling its usage.

Currently the mapping to /run/runc is mapped here. So you would need to modify that for it to be picked up from /run/docker/runtime-runc/moby/. I am trying with more updated version of ubuntu vagrant box to see locally

rabbah commented 6 years ago

I vaguely recall a pr to make the path configurable. I’ll try to find it

pneuschwander commented 6 years ago

Why does this happen? Is your most recent master-branch state a broken one?

rabbah commented 6 years ago

It’s likely an issue with Vagrant for which there is no continuous testing and there’s likely a missed docker version pinning.

chetanmeh commented 6 years ago

Not sure. I tried again with latest box

$ vagrant box list | grep xenial
ubuntu/xenial64   (virtualbox, 20180404.0.0)

And the vagrant flow worked fine with /run/runc found. Also it worked with bit older box 20180323.0.0.

pneuschwander commented 6 years ago

@chetanmeh Which Box Version has your VM? Did you create a new/fresh one?

Note that updating the box will not update an already-running Vagrant machine. To reflect the changes in the box, you will have to destroy and bring back up the Vagrant machine. (https://www.vagrantup.com/docs/cli/box.html)

chetanmeh commented 6 years ago

Its a freshly created 20180404.0.0. Some more details - I just checked the echo flow which worked. May be the problem shows up in java flow (which you are trying). Would try a proper java action and see if it works or not

pneuschwander commented 6 years ago

The action I tried is the official example (https://github.com/apache/incubator-openwhisk/blob/master/docs/actions.md#creating-java-actions), shouldn't that be proper?

@rabbah docker -v states Docker version 18.03.0-ce, build 0520e24 docker-runc -v states:

runc version 1.0.0-rc5
commit: 4fc53a81fb7c994640722ac585fa9ca548971871
spec: 1.0.0
rabbah commented 6 years ago

docker 18! That’s surely the culprit since the code is tested primary with docker 12. Carlos should we pin in Vagrant?

chetanmeh commented 6 years ago

Carlos should we pin in Vagrant?

I believe this was done with #3261. Probably with switch to ubuntu xenial (16.04) in #3358 the package name has changed.

ytang commented 6 years ago

I have exactly the same issue. Even if I use docker run -v /run/docker/runtime-runc/moby:/run/runc as suggested by @chetanmeh, it fails with another error: [RuncClient] code: 1 , stdout: , stderr: json: cannot unmarshal number into Go value of type string.

If I downgrade docker-ce from 18.03.0~ce-0~ubuntu to 17.03.0~ce-0~ubuntu-xenial, it works fine.

sechunOH commented 6 years ago

@ytang I also had same issue. This is a problem with unmatched runc version between your invoker host and your invoker. You can see that default runc version in your invoker is matched with docker 1.12(? maybe). But in your invoker host, maybe have different runc version.

Shortly, command "docker-runc --version" is different betweenn invoker host and invoker. This may cause the error when your invoker pause or resume your containers that created by invoker host's docker runc version. (because of invoker container option "-v /var/run/docker.sock:/var/run/docker.sock")

mcdan commented 6 years ago

@rabbah Probably not what you had in mind but basically I just ripped it out of the compose file: https://github.com/apache/incubator-openwhisk-devtools/commit/92b19a565608d7534a7acac8b14849ff86cff198#diff-cb44929492b826b5b16656db6a5a26a4 As it wasn't available in any of the supported targets.

rabbah commented 6 years ago

Docs improved to mention reuse and warm containers.

https://github.com/apache/incubator-openwhisk/blob/master/docs/actions.md#action-execution