Open mluds opened 2 years ago
It's possible that some of the changes in #698 would correct this. As part of that PR, I updated to use the latest openwhisk-package-alarms release and did test locally that the alarms provider was working.
I've just merged the PR. Maybe try again and see if it helped?
Thanks, I updated to the latest code. However, I'm still seeing this EPROTO error. It looks like the provider is attempting an HTTPS connection to an endpoint that only speaks plain HTTP.
Here are some logs from the alarmprovider pod:
[2021-11-05T16:44:35.882Z] [INFO] [??] [alarmsTrigger] [createDatabase] creating the trigger database
[2021-11-05T16:44:35.907Z] [INFO] [??] [alarmsTrigger] [server.listen] Express server listening on port 8080
[2021-11-05T16:44:36.118Z] [INFO] [??] [alarmsTrigger] [createDatabase] created trigger database: almalarmservice
[2021-11-05T16:44:36.376Z] [INFO] [??] [alarmsTrigger] [initAllTriggers] resetting system from last state
[2021-11-05T16:46:43.804Z] [INFO] [??] [alarmsTrigger] [setupFollow] got change for trigger xxxxxxxx/whisk.system/getvolumes-trigger
[2021-11-05T16:46:43.806Z] [INFO] [??] [alarmsTrigger] [scheduleCronAlarm] xxxxxxxx/whisk.system/getvolumes-trigger starting cron job
[2021-11-05T16:46:43.811Z] [INFO] [??] [alarmsTrigger] [setupFollow] xxxxxxxx/whisk.system/getvolumes-trigger created successfully
[2021-11-05T17:00:00.013Z] [INFO] [??] [alarmsTrigger] [fireTrigger] Alarm fired for xxxxxxxx/whisk.system/getvolumes-trigger attempting to fire trigger
(node:1) Warning: Setting the NODE_TLS_REJECT_UNAUTHORIZED environment variable to '0' makes TLS connections and HTTPS requests insecure by disabling certificate verification.
(Use `node --trace-warnings ...` to show where the warning was created)
[2021-11-05T17:00:00.043Z] [INFO] [??] [alarmsTrigger] [postTrigger] xxxxxxxx/whisk.system/getvolumes-trigger http post request, STATUS:
[2021-11-05T17:00:00.044Z] [ERROR] [??] [alarmsTrigger] [postTrigger] there was an error invoking xxxxxxxx/whisk.system/getvolumes-trigger {"message":"write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n","stack":"Error: write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n\n at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:94:16)","errno":-71,"code":"EPROTO","syscall":"write"}
[2021-11-05T17:00:00.044Z] [INFO] [??] [alarmsTrigger] [postTrigger] attempting to fire trigger again xxxxxxxx/whisk.system/getvolumes-trigger Retry Count: 1
[2021-11-05T17:00:01.050Z] [INFO] [??] [alarmsTrigger] [postTrigger] xxxxxxxx/whisk.system/getvolumes-trigger http post request, STATUS:
[2021-11-05T17:00:01.050Z] [ERROR] [??] [alarmsTrigger] [postTrigger] there was an error invoking xxxxxxxx/whisk.system/getvolumes-trigger {"message":"write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n","stack":"Error: write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n\n at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:94:16)","errno":-71,"code":"EPROTO","syscall":"write"}
[2021-11-05T17:00:01.050Z] [INFO] [??] [alarmsTrigger] [postTrigger] attempting to fire trigger again xxxxxxxx/whisk.system/getvolumes-trigger Retry Count: 2
[2021-11-05T17:00:02.081Z] [INFO] [??] [alarmsTrigger] [postTrigger] xxxxxxxx/whisk.system/getvolumes-trigger http post request, STATUS:
[2021-11-05T17:00:02.082Z] [INFO] [??] [alarmsTrigger] [postTrigger] attempting to fire trigger again xxxxxxxx/whisk.system/getvolumes-trigger Retry Count: 3
[2021-11-05T17:00:02.082Z] [ERROR] [??] [alarmsTrigger] [postTrigger] there was an error invoking xxxxxxxx/whisk.system/getvolumes-trigger {"message":"write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n","stack":"Error: write EPROTO 139717772076864:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n\n at WriteWrap.onWriteComplete [as oncomplete] (internal/stream_base_commons.js:94:16)","errno":-71,"code":"EPROTO","syscall":"write"}
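For anyone triaging similar logs: an OpenSSL "wrong version number" failure surfaced as EPROTO usually means the TLS client received bytes that were not a TLS record, which is what happens when an HTTPS request reaches a plain-HTTP port. Below is a minimal sketch of a check for that signature. The helper name looksLikeTlsToPlaintext is hypothetical and not part of openwhisk-package-alarms; the sample error copies the message from the log above.

```javascript
// Hypothetical triage helper (not part of the alarms provider).
// Returns true when an error matches the "TLS client hit a non-TLS port"
// signature: code EPROTO plus OpenSSL's "wrong version number" text.
function looksLikeTlsToPlaintext(err) {
  return (
    Boolean(err) &&
    err.code === "EPROTO" &&
    /wrong version number/i.test(String(err.message))
  );
}

// Sample error object built from the fields logged by postTrigger above.
const sampleError = {
  code: "EPROTO",
  syscall: "write",
  message:
    "write EPROTO 139717772076864:error:1408F10B:SSL routines:" +
    "ssl3_get_record:wrong version number:" +
    "../deps/openssl/openssl/ssl/record/ssl3_record.c:332:\n",
};

console.log(looksLikeTlsToPlaintext(sampleError)); // true
console.log(looksLikeTlsToPlaintext({ code: "ECONNREFUSED", message: "refused" })); // false
```

A match is only a hint, not proof: the same OpenSSL message can appear for other protocol mismatches, so it is worth confirming which port the request actually targets.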
I've verified the alarmprovider pod is running 2.3.0 (mirrored to our internal registry) and using node 14:
$ kubectl -n openwhisk-blue describe pod openwhisk-alarmprovider-bdd74df5f-5qm2j | grep "Image:"
Image: docker-registry.somedomain/busybox:latest
Image: docker-registry.somedomain/openwhisk/alarmprovider:2.3.0
root@openwhisk-alarmprovider-bdd74df5f-5qm2j:/# node --version
v14.17.2
Also here's the env if that helps at all:
root@openwhisk-alarmprovider-bdd74df5f-5qm2j:/# env
OPENWHISK_CONTROLLER_SERVICE_HOST=10.43.45.12
ENDPOINT_AUTH=openwhisk-nginx.openwhisk-blue.svc.cluster.local:80
YARN_VERSION=1.22.5
OPENWHISK_APIGATEWAY_SERVICE_HOST=10.43.158.177
OPENWHISK_CONTROLLER_PORT_8080_TCP_ADDR=10.43.45.12
DB_HOST=couchdb-svc-couchdb.couchdb-blue.svc.cluster.local:5984
OPENWHISK_APIGATEWAY_PORT_9000_TCP_PROTO=tcp
OPENWHISK_REDIS_PORT_6379_TCP_PORT=6379
OPENWHISK_NGINX_PORT_443_TCP_ADDR=10.43.108.19
OPENWHISK_APIGATEWAY_SERVICE_PORT=8080
OPENWHISK_APIGATEWAY_PORT_9000_TCP=tcp://10.43.158.177:9000
OPENWHISK_NGINX_SERVICE_HOST=10.43.108.19
OPENWHISK_APIGATEWAY_PORT_9000_TCP_PORT=9000
HOSTNAME=openwhisk-alarmprovider-bdd74df5f-5qm2j
OPENWHISK_CONTROLLER_PORT_8080_TCP_PROTO=tcp
OPENWHISK_REDIS_SERVICE_PORT=6379
OPENWHISK_APIGATEWAY_PORT_8080_TCP_PROTO=tcp
OPENWHISK_APIGATEWAY_SERVICE_PORT_API=9000
KUBERNETES_PORT_443_TCP_PROTO=tcp
OPENWHISK_REDIS_PORT_6379_TCP=tcp://10.43.235.11:6379
KUBERNETES_PORT_443_TCP_ADDR=10.43.0.1
OPENWHISK_REDIS_SERVICE_HOST=10.43.235.11
OPENWHISK_CONTROLLER_SERVICE_PORT_HTTP=8080
OPENWHISK_APIGATEWAY_PORT_8080_TCP_ADDR=10.43.158.177
KUBERNETES_PORT=tcp://10.43.0.1:443
OPENWHISK_NGINX_PORT=tcp://10.43.108.19:80
OPENWHISK_NGINX_PORT_80_TCP_PORT=80
PWD=/
OPENWHISK_CONTROLLER_PORT_8080_TCP_PORT=8080
HOME=/root
OPENWHISK_REDIS_PORT_6379_TCP_ADDR=10.43.235.11
OPENWHISK_CONTROLLER_PORT=tcp://10.43.45.12:8080
DB_PASSWORD=8qgqiFKazAZH9AWz
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP_PORT=443
OPENWHISK_APIGATEWAY_PORT_8080_TCP_PORT=8080
ROUTER_HOST=openwhisk-nginx.openwhisk-blue.svc.cluster.local:80
NODE_VERSION=14.17.2
OPENWHISK_NGINX_PORT_80_TCP_ADDR=10.43.108.19
OPENWHISK_APIGATEWAY_SERVICE_PORT_MGMT=8080
KUBERNETES_PORT_443_TCP=tcp://10.43.0.1:443
OPENWHISK_NGINX_PORT_80_TCP=tcp://10.43.108.19:80
OPENWHISK_NGINX_PORT_443_TCP_PROTO=tcp
OPENWHISK_APIGATEWAY_PORT_9000_TCP_ADDR=10.43.158.177
OPENWHISK_NGINX_PORT_443_TCP=tcp://10.43.108.19:443
OPENWHISK_APIGATEWAY_PORT=tcp://10.43.158.177:8080
OPENWHISK_CONTROLLER_SERVICE_PORT=8080
TERM=xterm
DB_USERNAME=admin
OPENWHISK_REDIS_PORT=tcp://10.43.235.11:6379
OPENWHISK_NGINX_PORT_80_TCP_PROTO=tcp
OPENWHISK_NGINX_SERVICE_PORT_HTTP=80
OPENWHISK_CONTROLLER_PORT_8080_TCP=tcp://10.43.45.12:8080
OPENWHISK_NGINX_SERVICE_PORT=80
SHLVL=1
OPENWHISK_NGINX_SERVICE_PORT_HTTPS=443
KUBERNETES_SERVICE_PORT=443
OPENWHISK_REDIS_SERVICE_PORT_REDIS=6379
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
OPENWHISK_REDIS_PORT_6379_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.43.0.1
DB_PREFIX=alm
OPENWHISK_APIGATEWAY_PORT_8080_TCP=tcp://10.43.158.177:8080
OPENWHISK_NGINX_PORT_443_TCP_PORT=443
DB_PROTOCOL=http
_=/usr/bin/env
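The env dump can be checked against what a later comment in this thread reports, namely that the provider's utils.js builds its target as 'https://' + routerHost. Below is a rough diagnostic sketch under that assumption. The function name checkRouterTarget is hypothetical, and treating port 80 as plain HTTP is a convention rather than something the env itself proves.

```javascript
// Hypothetical diagnostic, not part of the alarms provider.
// Assumption (from a later comment in this thread): the provider builds
// its target URL as 'https://' + ROUTER_HOST.
function checkRouterTarget(env) {
  const routerHost = env.ROUTER_HOST || "";
  const match = /:(\d+)$/.exec(routerHost);
  const port = match ? Number(match[1]) : null;
  return {
    url: "https://" + routerHost,
    port: port,
    // Assumption: port 80 conventionally serves plain HTTP, so pairing it
    // with an https:// scheme is suspicious.
    suspicious: port === 80,
  };
}

// Value taken from the env output above.
console.log(
  checkRouterTarget({
    ROUTER_HOST: "openwhisk-nginx.openwhisk-blue.svc.cluster.local:80",
  })
);
```

With the ROUTER_HOST shown above, the sketch reports an https:// URL on port 80 and flags it as suspicious, which would be consistent with the EPROTO error in the logs.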
OK. I will try to find time this weekend to verify that the alarm provider still works for me locally. I had tested it while working on #698, but I also ended up doing and undoing several things in that commit to try to avoid needing #713, and it's possible I backed out something that fixed a problem with http/https in the alarm provider. I was experimenting with different options for how to configure that.
I have the same EPROTO error. Did we find a solution? I can create cron triggers, but when one fires I get this error. It looks like a misconfigured IP, the https protocol being used where it shouldn't be, the -insecure flag not working, or possibly an --auth error.
I recently hit this problem and solved it. In my case, the cause was that utils.js in the provider folder builds the URI of the API host like this: this.uriHost ='https://' + this.routerHost;. I deployed OpenWhisk using this helm chart, and if you look at the YAML of the alarm provider deployment (here), the ROUTER_HOST and ENDPOINT_AUTH environment variables use the INTERNAL API host name and port. That is a problem: the internal port is 80, which serves plain HTTP with no TLS, yet the URI is built with https. The code should be patched to check whether the provided port is a secure one and, if it is not, to use http as the protocol.
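The patch suggested above might look roughly like the following. This is only a sketch, not the provider's actual code: buildUriHost and its protocolOverride parameter are hypothetical names, and the rule "port 443 or no port means https, anything else means http" is an assumption that will not hold for every deployment (for example, TLS on a non-standard port).

```javascript
// Hypothetical replacement for the hard-coded 'https://' + routerHost.
// Assumption: port 443 (or no explicit port) implies TLS; any other port
// is treated as plain HTTP unless the caller overrides the protocol.
function buildUriHost(routerHost, protocolOverride) {
  if (protocolOverride) {
    // An explicit setting wins over the port-based guess.
    return protocolOverride + "://" + routerHost;
  }
  const match = /:(\d+)$/.exec(routerHost);
  // No port given: keep the existing behaviour and default to https.
  const port = match ? Number(match[1]) : 443;
  const protocol = port === 443 ? "https" : "http";
  return protocol + "://" + routerHost;
}

console.log(buildUriHost("openwhisk-nginx.openwhisk-blue.svc.cluster.local:80"));
// http://openwhisk-nginx.openwhisk-blue.svc.cluster.local:80
console.log(buildUriHost("openwhisk.example.com"));
// https://openwhisk.example.com
```

Guessing from the port is fragile, so an explicit protocol setting supplied by the deployment is probably the more robust design. The override parameter is there to illustrate that option.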
I have a cron trigger deployed in openwhisk:
However I get the following error in the alarmprovider pod when it tries to fire:
This seems to indicate the wrong protocol is being used (http vs. https). The problem is I'm not sure which URL it's using for this request.
If I exec into the pod and try curling the internal URL it seems to work fine:
If I try the external URL, it can't verify the certificate. However, the certificate is valid, which I double checked. It also seems like the error is not caused by certificate validation.
I also tried checking the environment inside the pod to see if I could find which URL it's using.
I tried the 10.43.133.144 address and that also seems to work from the CLI:
Any idea what might be going on, or how I can figure out which URL it's using?