amitbatajoo opened this issue 1 year ago (status: Open)
Try to figure out the reason in the controller logs.
If you used the default configuration, it would be under the /tmp/wsklogs/controller directory.
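The log location respects OPENWHISK_TMP_DIR; here is a rough sketch of how to locate the logs, assuming the default layout and a controller container named controller0 (adjust the name if your deployment differs):

```shell
# Hedged sketch: find the controller logs in a default local deployment.
# OPENWHISK_TMP_DIR is optional; /tmp is the documented default base.
LOG_DIR="${OPENWHISK_TMP_DIR:-/tmp}/wsklogs/controller"
echo "looking for logs in: $LOG_DIR"
if [ -d "$LOG_DIR" ]; then
  ls -l "$LOG_DIR"
elif command -v docker >/dev/null 2>&1; then
  # Fall back to the container's own stdout/stderr
  docker logs controller0 2>&1 | tail -n 50
else
  echo "no log directory found and docker is not available"
fi
```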
@style95 Thank you for your suggestion, but I am unable to find the /tmp/wsklogs/controller directory in my environment. Is there any additional setting required, or do I need to open ports, for a successful installation?
@style95
Now I am getting this error when I execute the command: ansible-playbook -i environments/local/ openwhisk.yml
fatal: [kafka0]: FAILED! => {"attempts": 10, "changed": true, "cmd": "(echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0", "delta": "0:00:01.017253", "end": "2022-10-16 05:55:31.438724", "msg": "non-zero return code", "rc": 1, "start": "2022-10-16 05:55:30.421471", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
[FAILED]
(echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0 non-zero return code
PLAY RECAP *****
etcd0 : ok=0 changed=0 unreachable=0 failed=0
kafka0 : ok=9 changed=3 unreachable=0 failed=1
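The failing task is the playbook's check that Kafka broker 0 has registered itself in ZooKeeper. The same probe can be rerun by hand; a sketch, assuming nc is installed and using the host/port from the task output (srvr is a standard ZooKeeper four-letter command, like the dump used by the playbook):

```shell
# Hedged sketch: probe ZooKeeper directly. An empty reply means ZooKeeper
# itself is down or unreachable on this address, which would also make the
# /brokers/ids/0 grep in the playbook fail.
ZK_HOST=172.17.0.1
ZK_PORT=2181
if command -v nc >/dev/null 2>&1; then
  reply=$( (echo srvr; sleep 1) | nc -w 2 "$ZK_HOST" "$ZK_PORT" 2>/dev/null || true)
  if [ -n "$reply" ]; then
    printf '%s\n' "$reply"
  else
    echo "no answer from ZooKeeper at $ZK_HOST:$ZK_PORT (is the container running?)"
  fi
else
  echo "nc not installed"
fi
```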
We can't figure out the reason from what you provided. Please share the container logs or component logs. You can find logs under this directory or with the docker logs command.
Hi @style95, sorry for the late reply. Let me explain simply. On my new machine (Ubuntu 18.04.6 LTS) I am trying to install OpenWhisk. After a successful build, I tried to execute the ansible commands one by one:
ansible-playbook -i environments/local/ couchdb.yml
ansible-playbook -i environments/local/ initdb.yml
ansible-playbook -i environments/local/ wipe.yml
ansible-playbook -i environments/local/ apigateway.yml
And I encounter the error when I run this command:
root@ubuntu-s-1vcpu-2gb-sgp1-01:~/openwhisk/ansible# ansible-playbook -i environments/local/ openwhisk.yml
fatal: [172.17.0.1]: FAILED! => {"changed": false, "msg": "The header parameter requires a key:value,key:value syntax to be properly parsed."}
The header parameter requires a key:value,key:value syntax to be properly parsed.
PLAY RECAP *****
172.17.0.1 : ok=11 changed=9 unreachable=0 failed=1
Please share the container logs or component logs.
Here is a log:
root@ubuntu-s-1vcpu-2gb-sgp1-01:~/openwhisk/tmp/wsklogs# cat controller0/controller0_logs.log
[2022-10-17T23:10:35.980Z] [INFO] Slf4jLogger started
[2022-10-17T23:10:37.158Z] [INFO] Remoting started with transport [Artery tcp]; listening on address [akka://controller-actor-system@172.17.0.7:25520] with UID [9046092925284476410]
[2022-10-17T23:10:37.227Z] [INFO] Cluster Node [akka://controller-actor-system@172.17.0.7:25520] - Starting up, Akka version [2.6.12] ...
[2022-10-17T23:10:37.520Z] [INFO] Cluster Node [akka://controller-actor-system@172.17.0.7:25520] - Registered cluster JMX MBean [akka:type=Cluster]
[2022-10-17T23:10:37.522Z] [INFO] Cluster Node [akka://controller-actor-system@172.17.0.7:25520] - Started up successfully
[2022-10-17T23:10:37.784Z] [INFO] Cluster Node [akka://controller-actor-system@172.17.0.7:25520] - No downing-provider-class configured, manual cluster downing required, see https://doc.akka.io/docs/akka/current/typed/cluster.html#downing
[2022-10-17T23:10:37.786Z] [INFO] Cluster Node [akka://controller-actor-system@172.17.0.7:25520] - No seed nodes found in configuration, relying on Cluster Bootstrap for joining
[2022-10-17T23:10:39.666Z] [WARN] Failed to attach the instrumentation because the Kamon Bundle is not present on the classpath
[2022-10-17T23:10:40.076Z] [INFO] Started the Kamon StatsD reporter
[2022-10-17T23:10:41.333Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.triggers.fires.perMinute
[2022-10-17T23:10:41.334Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.sequence.maxLength
[2022-10-17T23:10:41.334Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.concurrent
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.name
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.perMinute
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.proto
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.port
[2022-10-17T23:10:41.337Z] [INFO] [#tid_sid_unknown] [Config] environment set value for runtimes.manifest
[2022-10-17T23:10:41.337Z] [INFO] [#tid_sid_unknown] [Config] environment set value for port
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic completed0 created
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic health created
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic cacheInvalidation created
[2022-10-17T23:10:41.760Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic events created
[2022-10-17T23:10:42.460Z] [INFO] [#tid_sid_controller] [Controller] starting controller instance 0 [marker:controller_startup0_counter:1190]
[2022-10-17T23:10:44.458Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2022-10-17T23:10:44.656Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2022-10-17T23:10:45.060Z] [INFO] [#tid_sid_unknown] [InvokerReactive] LogStoreProvider: class org.apache.openwhisk.core.containerpool.logging.DockerToActivationLogStore
[2022-10-17T23:10:45.933Z] [INFO] [#tid_sid_unknown] [DockerClientWithFileAccess] Detected docker client version 18.06.3-ce
[2022-10-17T23:10:46.124Z] [INFO] [#tid_sid_invoker] [DockerClientWithFileAccess] running /usr/bin/docker ps --quiet --no-trunc --all --filter name=wsk0 (timeout: 1 minute) [marker:invoker_docker.ps_start:4858]
[2022-10-17T23:10:46.406Z] [INFO] [#tid_sid_invoker] [DockerClientWithFileAccess] [marker:invoker_docker.ps_finish:5140:145]
[2022-10-17T23:10:46.412Z] [INFO] [#tid_sid_invoker] [DockerContainerFactory] removing 0 action containers.
[2022-10-17T23:10:48.650Z] [INFO] [#tid_sid_invoker] [CouchDbRestStore] [QUERY] 'whisk_local_subjects' searching 'namespaceThrottlings/blockedNamespaces [marker:database_queryView_start:7384]
[2022-10-17T23:10:50.224Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 2000, pipeline fill at = 2000, pipeline depth = 4000
[2022-10-17T23:10:50.325Z] [INFO] [#tid_sid_invoker] [CouchDbRestStore] [marker:database_queryView_finish:9058:1672]
[2022-10-17T23:10:50.327Z] [INFO] [#tid_sid_unknown] [InvokerReactive] updated blacklist to 0 entries
[2022-10-17T23:10:50.703Z] [INFO] [#tid_sid_invokerWarmup] [ContainerPool] found 0 started and 0 starting; initing 2 pre-warms to desired count: 2 for kind:nodejs:14 mem:256 MB
[2022-10-17T23:10:50.727Z] [INFO] [#tid_sid_controller] [Controller] loadbalancer initialized: LeanBalancer
[2022-10-17T23:10:51.084Z] [INFO] [#tid_sid_controller] [KindRestrictor] all kinds are allowed, the white-list is not specified
[2022-10-17T23:10:51.624Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 256 --memory 256m --memory-swap 256m --network bridge -e OW_API_HOST=https://172.17.0.1 -e OW_ALLOW_CONCURRENT=True --name wsk0_1_prewarm_nodejs14 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v14:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:10358]
[2022-10-17T23:10:51.630Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 256 --memory 256m --memory-swap 256m --network bridge -e OW_API_HOST=https://172.17.0.1 -e OW_ALLOW_CONCURRENT=True --name wsk0_2_prewarm_nodejs14 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v14:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:10362]
[2022-10-17T23:10:53.314Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:12047:1680]
[2022-10-17T23:10:53.354Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:12088:1728]
[2022-10-17T23:10:56.377Z] [INFO] [#tid_sid_controller] [ActionsApi] actionSequenceLimit '50'
[2022-10-17T23:10:57.534Z] [WARN] Binding with a connection source not supported with HTTP/2. Falling back to HTTP/1.1.
Thank you in advance.
You need to describe the step at which you faced the error. What you provided is just an error log; we cannot figure out the step from it alone. Also, you said you provided the controller logs, but that looks like an invoker log?
Thank you for the comments. I am not sure about the steps you are asking about, but here are the steps I am doing for setup:
I am following the tutorial from here https://github.com/apache/openwhisk/blob/master/ansible/README.md
(1) apt install git
Next, clone the repo to the local directory:
(2) git clone https://github.com/apache/openwhisk.git openwhisk
(3) cd openwhisk && cd tools/ubuntu-setup && ./all.sh
(4) Next, configure a persistent storage database for OpenWhisk, with CouchDB:
export OW_DB=CouchDB
export OW_DB_USERNAME=root
export OW_DB_PASSWORD=root123
export OW_DB_PROTOCOL=http
export OW_DB_HOST=172.17.0.1
export OW_DB_PORT=5984
(5) In the openwhisk/ansible directory: ansible-playbook -i environments/local/ setup.yml
Next, use CouchDB to deploy OpenWhisk and make sure that db_local.ini is available locally.
(6) Execute the deployment command in the openwhisk/ directory: ./gradlew distDocker
(7) Next, enter the openwhisk/ansible directory and run:
ansible-playbook -i environments/local/ couchdb.yml
ansible-playbook -i environments/local/ initdb.yml
ansible-playbook -i environments/local/ wipe.yml
ansible-playbook -i environments/local/ apigateway.yml
(8) ansible-playbook -i environments/local/ openwhisk.yml
ansible-playbook -i environments/local/ postdeploy.yml
Steps 1 to 7 installed and built without any trouble, but at step 8 I faced the error explained in: https://github.com/apache/openwhisk/issues/5331#issuecomment-1281621529
Hey. Since this is still open: I am having the same issue with setting up the controller when I try to run openwhisk.yml with ansible-playbook. The controller log says that the keystore password is wrong; I assume this password is generated with setup.yml and then copied later when I run openwhisk.yml or controller.yml. Any suggestions to solve this?
TASK [controller : add seed nodes to controller environment] *******************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.114) 0:00:28.125 *******
ok: [controller0] => (item=[0, '172.17.0.1'])
TASK [controller : Add akka environment to controller environment] *************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.169) 0:00:28.295 *******
ok: [controller0]
TASK [controller : lean controller setup] **************************************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.242) 0:00:28.537 *******
skipping: [controller0]
TASK [controller : (re)start controller] ***************************************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.038) 0:00:28.576 *******
changed: [controller0]
TASK [controller : wait until the Controller in this host is up and running] ***************************************************************
Monday 27 February 2023 12:29:28 -0600 (0:00:01.088) 0:00:29.664 *******
FAILED - RETRYING: wait until the Controller in this host is up and running (12 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (11 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (10 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (9 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (8 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (7 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (6 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (5 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (4 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (3 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (2 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (1 retries left).
fatal: [controller0]: FAILED! => {"attempts": 12, "changed": false, "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "https://172.17.0.1:10001/ping"}
Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>
PLAY RECAP *********************************************************************************************************************************
controller0 : ok=25 changed=7 unreachable=0 failed=1 skipped=14 rescued=0 ignored=0
etcd0 : ok=0 changed=0 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0
kafka0 : ok=10 changed=4 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0
Monday 27 February 2023 12:31:33 -0600 (0:02:04.909) 0:02:34.574 *******
===============================================================================
controller : wait until the Controller in this host is up and running ------------------------------------------------------------- 124.91s
kafka : wait until the kafka server started up -------------------------------------------------------------------------------------- 7.49s
zookeeper : (re)start zookeeper ----------------------------------------------------------------------------------------------------- 2.07s
kafka : (re)start kafka using 'wurstmeister/kafka:2.13-2.7.0' ---------------------------------------------------------------------- 1.90s
controller : copy certificates ------------------------------------------------------------------------------------------------------ 1.68s
zookeeper : wait until the Zookeeper in this host is up and running ----------------------------------------------------------------- 1.42s
controller : populate environment variables for controller -------------------------------------------------------------------------- 1.36s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 1.23s
controller : (re)start controller --------------------------------------------------------------------------------------------------- 1.09s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 0.92s
controller : check if whisk_local_activations with CouchDB exists ------------------------------------------------------------------- 0.87s
controller : copy nginx certificate keystore ---------------------------------------------------------------------------------------- 0.86s
controller : check if whisk_local_whisks with CouchDB exists ------------------------------------------------------------------------ 0.85s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 0.83s
controller : copy jmxremote password file ------------------------------------------------------------------------------------------- 0.80s
controller : check if whisk_local_subjects with CouchDB exists ---------------------------------------------------------------------- 0.79s
controller : copy jmxremote access file --------------------------------------------------------------------------------------------- 0.50s
controller : ensure controller config directory is created with permissions --------------------------------------------------------- 0.41s
kafka : create kafka certificate directory ------------------------------------------------------------------------------------------ 0.40s
controller : check, that required databases exist ----------------------------------------------------------------------------------- 0.31s
[2023-02-27T18:29:37.129Z] [INFO] [#tid_sid_controller] [Controller] loadbalancer initialized: ShardingContainerPoolBalancer
[2023-02-27T18:29:37.135Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2023-02-27T18:29:37.282Z] [INFO] [#tid_sid_controller] [KindRestrictor] all kinds are allowed, the white-list is not specified
[2023-02-27T18:29:38.217Z] [INFO] [#tid_sid_controller] [ActionsApi] actionSequenceLimit '50'
Exception in thread "main" java.io.IOException: keystore password was incorrect
at java.base/sun.security.pkcs12.PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2117)
at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:222)
at java.base/java.security.KeyStore.load(KeyStore.java:1479)
at org.apache.openwhisk.common.Https$.applyHttpsConfig(Https.scala:58)
at org.apache.openwhisk.common.Https$.connectionContextServer(Https.scala:92)
at org.apache.openwhisk.http.BasicHttpService$.$anonfun$startHttpService$1(BasicHttpService.scala:174)
at org.apache.openwhisk.http.BasicHttpService$$$Lambda$2199/00000000D6E0BEB0.apply(Unknown Source)
at scala.Option.map(Option.scala:230)
at org.apache.openwhisk.http.BasicHttpService$.startHttpService(BasicHttpService.scala:174)
at org.apache.openwhisk.core.controller.Controller$.start(Controller.scala:285)
at org.apache.openwhisk.core.controller.Controller$.main(Controller.scala:233)
at org.apache.openwhisk.core.controller.Controller.main(Controller.scala)
Caused by: java.security.UnrecoverableKeyException: failed to decrypt safe contents entry: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
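The stack trace shows the controller failing to open its PKCS12 keystore. One way to confirm a password mismatch is sketched below with keytool; the path and password are placeholders, not the actual deployment values, so substitute the ones from your own config:

```shell
# Hedged sketch: check whether a PKCS12 keystore opens with the configured
# password. KEYSTORE and STOREPASS are placeholders -- substitute the actual
# values from your deployment.
KEYSTORE=/path/to/controller-openwhisk-keystore.p12   # placeholder path
STOREPASS=changeme                                    # placeholder password
if command -v keytool >/dev/null 2>&1 && [ -f "$KEYSTORE" ]; then
  if keytool -list -storetype PKCS12 -keystore "$KEYSTORE" -storepass "$STOREPASS" >/dev/null 2>&1; then
    echo "keystore password OK"
  else
    echo "keystore password mismatch: wipe the generated credentials and rerun setup.yml"
  fi
else
  echo "keytool not found or keystore missing; skipping check"
fi
```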
Hello @peimanfth
I am facing the exact same issue you've described above. Did you find a fix to this?
Facing a similar issue; did you find a fix? @vishalvrv9 @amitbatajoo @peimanfth
I believe it was an issue with pre-existing OpenWhisk credentials. I cleaned the OpenWhisk installation and deleted the directory outside the OpenWhisk directory that is associated with OpenWhisk credentials. I also cleaned the CouchDB instance and re-installed everything again. This solved the issue for me. @vishalvrv9 @Dakzh10
Hi, I am facing the exact same issue. Could you please be more specific about the files you deleted? And by "cleaned the CouchDB instance", do you mean removing the CouchDB container? @peimanfth
If you are running it on Ubuntu, you can find the credentials under /var/tmp/wskconf; there are credentials for each component. For instance, if you only have a single controller in your deployment, its credentials should be under controller/controller0. If you delete the whole directory, it will be regenerated each time you run setup.yml.
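The cleanup described above, as a sketch (this assumes the default /var/tmp/wskconf location mentioned here):

```shell
# Hedged sketch: remove the generated OpenWhisk credentials so that
# setup.yml recreates them on the next run.
CONF_DIR=/var/tmp/wskconf
if [ -d "$CONF_DIR" ]; then
  rm -rf "$CONF_DIR"
  echo "removed $CONF_DIR; rerun: ansible-playbook -i environments/local/ setup.yml"
else
  echo "$CONF_DIR not present; nothing to clean"
fi
```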
On a further note, since you are reproducing rainbowCake, make sure OpenWhisk can access your CouchDB instance. The problem could be that you already have a local CouchDB on your machine and also another instance deployed on your Docker engine. In that case, I would recommend removing your manually deployed CouchDB and letting the yml scripts install CouchDB using Docker images.
I did the following:
docker stop $(docker ps -aq) && docker rm $(docker ps -aq)
docker kill $(docker ps -q)
docker image rm $(docker image ls -aq)
docker volume rm $(docker volume ls -q)
docker network rm $(docker network ls -q)
docker system prune -a --volumes -f
ansible-playbook -i environments/$ENVIRONMENT openwhisk.yml -e mode=clean
ansible-playbook -i environments/$ENVIRONMENT controller.yml -e mode=clean
But still no luck. I think the problem is that I have two Python versions installed, Python 3.10.12 and Python 3.9.0. I have configured this project to use Python 3.9.0, but some commands in the setup script are still using Python 3.10.12.
I spun up a cloud VM running Ubuntu 20.04 LTS with Python 3.8.0 (not Python 3.10), and when I ran the scripts inside this VM they ran to completion and I didn't face any issue. So it seems like the Python version is the main cause.
I am still stuck at the task "wait until the Controller in this host is up and running". I have exhausted all options but no luck. If anyone could help me with this problem, I would highly appreciate it. This is what I get when I run the openwhisk.yml playbook.
FYI, I couldn't find the log directory /tmp/wsklogs/controller; I even exported the variable OPENWHISK_TMP_DIR in my bashrc profile. And when I check the container logs, they are empty.
I also tried changing the URL by modifying the protocol (http <=> https) and ansible_host (172.17.0.1, 127.0.0.1, localhost, VM_IP (34.121.94.184)).
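The failing task polls an HTTPS ping endpoint, so it can be probed directly with curl instead of rerunning the whole playbook for each host variant; a sketch using the port from the earlier failure output (the hosts below are examples, swap in your own):

```shell
# Hedged sketch: probe the controller health endpoint the playbook polls.
# -k skips certificate verification (the deploy uses self-signed certs);
# -m 5 caps each attempt at 5 seconds. "000" means no connection at all.
PORT=10001
for HOST in 172.17.0.1 127.0.0.1; do
  if command -v curl >/dev/null 2>&1; then
    code=$(curl -k -m 5 -s -o /dev/null -w '%{http_code}' "https://$HOST:$PORT/ping" || true)
    echo "https://$HOST:$PORT/ping -> HTTP $code"
  fi
done
```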
System Details:
OS: Ubuntu 22.04.4 LTS
Python version: 3.9.0 (tried with other Python versions and even set the global Python interpreter using the command: echo -e "\nansible_python_interpreter: $(which python)\n" >> ./environments/local/group_vars/all)
pip version: 20.2.3
docker version: 27.3.1
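If mixed Python versions are the suspect (as noted earlier in the thread), pinning ansible_python_interpreter to an explicit binary may help; a sketch, with the group_vars path taken from the command above (the interpreter choice is an assumption, pick whichever version your environment needs):

```shell
# Hedged sketch: pin Ansible to one interpreter so every task runs under the
# same Python, rather than whatever `python` happens to resolve to per task.
GROUP_VARS=./environments/local/group_vars/all
PYBIN=$(command -v python3.9 || command -v python3)
echo "would pin: ansible_python_interpreter: $PYBIN"
# Append only if the inventory file actually exists:
if [ -f "$GROUP_VARS" ]; then
  printf '\nansible_python_interpreter: %s\n' "$PYBIN" >> "$GROUP_VARS"
fi
```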
Looking forward to your reply. Thank you for taking the time to read this. Please let me know if you need any further details.
Unable to
Please help me find where and what I am missing in the settings. How can I deploy successfully?
Thank you in advance.