yuzutech / kroki

Creates diagrams from textual descriptions!
https://kroki.io
MIT License
2.88k stars 215 forks source link

Kubernetes automatically sets KROKI_PORT to tcp://ip:port causing a ClassCastException #576

Closed jkroepke closed 3 years ago

jkroepke commented 3 years ago

Hi,

i tried to start kroki (0.10 from docker) on out openshift 3.11 (kubernetes 1.11) instance.

The container just raise a ClassCastException on startup. On a local docker environment, everything works fine.

java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')
    at io.vertx.core.json.JsonObject.getInteger(JsonObject.java:365)
    at io.kroki.server.Server.start(Server.java:127)
    at io.kroki.server.Server.lambda$start$1(Server.java:52)
    at io.vertx.config.impl.ConfigRetrieverImpl.lambda$getConfig$2(ConfigRetrieverImpl.java:182)
    at io.vertx.config.impl.ConfigRetrieverImpl.lambda$compute$9(ConfigRetrieverImpl.java:296)
    at io.vertx.core.impl.FutureImpl.dispatch(FutureImpl.java:105)
    at io.vertx.core.impl.FutureImpl.onComplete(FutureImpl.java:83)
    at io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:131)
    at io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:25)
    at io.vertx.core.Future.setHandler(Future.java:126)
    at io.vertx.core.CompositeFuture.setHandler(CompositeFuture.java:184)
    at io.vertx.config.impl.ConfigRetrieverImpl.compute(ConfigRetrieverImpl.java:275)
    at io.vertx.config.impl.ConfigRetrieverImpl.getConfig(ConfigRetrieverImpl.java:175)
    at io.kroki.server.Server.start(Server.java:48)
    at io.vertx.core.impl.DeploymentManager.lambda$doDeploy$9(DeploymentManager.java:556)
    at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
    at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)
{"timestamp":"1612175857778","level":"ERROR","thread":"vert.x-eventloop-thread-1","logger":"io.vertx.core.impl.launcher.commands.VertxIsolatedDeployer","message":"Failed in deploying verticle","context":"default","exception":"java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')\n\tat io.vertx.core.json.JsonObject.getInteger(JsonObject.java:365)\n\tat io.kroki.server.Server.start(Server.java:127)\n\tat io.kroki.server.Server.lambda$start$1(Server.java:52)\n\tat io.vertx.config.impl.ConfigRetrieverImpl.lambda$getConfig$2(ConfigRetrieverImpl.java:182)\n\tat io.vertx.config.impl.ConfigRetrieverImpl.lambda$compute$9(ConfigRetrieverImpl.java:296)\n\tat io.vertx.core.impl.FutureImpl.dispatch(FutureImpl.java:105)\n\tat io.vertx.core.impl.FutureImpl.onComplete(FutureImpl.java:83)\n\tat io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:131)\n\tat io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:25)\n\tat io.vertx.core.Future.setHandler(Future.java:126)\n\tat io.vertx.core.CompositeFuture.setHandler(CompositeFuture.java:184)\n\tat io.vertx.config.impl.ConfigRetrieverImpl.compute(ConfigRetrieverImpl.java:275)\n\tat io.vertx.config.impl.ConfigRetrieverImpl.getConfig(ConfigRetrieverImpl.java:175)\n\tat io.kroki.server.Server.start(Server.java:48)\n\tat io.vertx.core.impl.DeploymentManager.lambda$doDeploy$9(DeploymentManager.java:556)\n\tat io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)\n\tat io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)\n\tat io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Unknown Source)\n"}
java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')
    at io.vertx.core.json.JsonObject.getInteger(JsonObject.java:365)
    at io.kroki.server.Server.start(Server.java:127)
    at io.kroki.server.Server.lambda$start$1(Server.java:52)
    at io.vertx.config.impl.ConfigRetrieverImpl.lambda$getConfig$2(ConfigRetrieverImpl.java:182)
    at io.vertx.config.impl.ConfigRetrieverImpl.lambda$compute$9(ConfigRetrieverImpl.java:296)
    at io.vertx.core.impl.FutureImpl.dispatch(FutureImpl.java:105)
    at io.vertx.core.impl.FutureImpl.onComplete(FutureImpl.java:83)
    at io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:131)
    at io.vertx.core.impl.CompositeFutureImpl.onComplete(CompositeFutureImpl.java:25)
    at io.vertx.core.Future.setHandler(Future.java:126)
    at io.vertx.core.CompositeFuture.setHandler(CompositeFuture.java:184)
    at io.vertx.config.impl.ConfigRetrieverImpl.compute(ConfigRetrieverImpl.java:275)
    at io.vertx.config.impl.ConfigRetrieverImpl.getConfig(ConfigRetrieverImpl.java:175)
    at io.kroki.server.Server.start(Server.java:48)
    at io.vertx.core.impl.DeploymentManager.lambda$doDeploy$9(DeploymentManager.java:556)
    at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
    at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Unknown Source)

Cause:

Kubernetes (in some situations, Docker, too). Set the en var KROKI_PORT to tcp://172.30.18.91:8000 if there is a service called kroki inside the same namespace:

/ $ echo $KROKI_PORT
tcp://172.30.18.91:8000

KROKI_PORT is also used by kroki itself to define the listing port.

I would recommended to change the name to avoid conflicts in the feature, e.g. KROKI_SERVER_PORT or KROKI_LISTEN_PORT.

If you, it would great to document this somewhere.

ggrossetie commented 3 years ago

Hi,

Kubernetes (in some situations, Docker, too)

I don't think that Docker will automatically define KROKI_PORT as an environment variable. I'm not familiar with Kubernetes but it seems a bit odd... why Kubernetes automatically defines ${SERVICE_NAME}_PORT with an URL and not a port!?

I guess the workaround is to explicitly define the environment variable KROKI_PORT or use another service name?

I would recommended to change the name to avoid conflicts in the feature, e.g. KROKI_SERVER_PORT or KROKI_LISTEN_PORT.

I could add an alias but I don't think that would solve this issue... so that would be a breaking change.

If you, it would great to document this somewhere.

https://docs.kroki.io/kroki/setup/configuration/

ggrossetie commented 3 years ago

For reference: https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables

jkroepke commented 3 years ago

I don't think that Docker will automatically define KROKI_PORT as an environment variable.

https://docs.docker.com/network/links/#environment-variables

It's a rarely known feature. In the early days, Kubernetes wants to be docker compatible. Since docker links is deprecated, it might be not happen on docker anymore.

https://docs.kroki.io/kroki/setup/configuration/ I saw the docs, that was the hint why it could break on Kubernetes. What I mean with documentation was to document this case. Like if you are running on Kubernetes, you have to override the environment variable.

An another idea:

kroki could catch the exception here, add a log line like Could not parse KROKI_PORT, ignoring ... and start the server on the default port 8000.

This change is still non breaking, but it provides a "works out of the box setup" on platforms like Kubernetes.

ggrossetie commented 3 years ago

docs.docker.com/network/links/#environment-variables It's a rarely known feature. In the early days, Kubernetes wants to be docker compatible. Since docker links is deprecated, it might be not happen on docker anymore.

OH I didn't know that, thanks! I can indeed reproduce this issue using --link.

Start a container named kroki. It can be any image, for instance we can start a NGINX instance:

$ docker run --name kroki --rm nginx:alpine

Start a Kroki container using a link on the kroki instance (which in this case is a NGINX instance):

$ docker run --link kroki:kroki yuzutech/kroki:latest

The --link option will automatically define a few environment variables including ${name}_PORT=tcp://ip:port where ${name} is the link name, so effectively KROKI_PORT.

ggrossetie commented 3 years ago

An another idea: kroki could catch the exception here, add a log line like Could not parse KROKI_PORT, ignoring ... and start the server on the default port 8000. This change is still non breaking, but it provides a "works out of the box setup" on platforms like Kubernetes.

I was drafting a list of possible solutions including what you suggested:

Possible solutions

update the environment variable name to avoid conflicts

Despite the fact that this is a breaking change, it does not prevent conflicts. For instance, if someone use kroki-server it will conflict with KROKI_SERVER_PORT. Similarly, but arguably less likely, if someone use kroki-listen it will conflict with KROKI_LISTEN_PORT.

I wish Kubernetes/Docker will not pollute environment variables without using a unique prefix... but no :disappointed:

check if the KROKI_PORT is an integer otherwise log a warning message and use the default port (8000)

Not a big fan since I tend to prefer the "fail fast" approach. A warning message can easily go unnoticed and make it harder to debug.

check if KROKI_PORT starts with tcp://

It seems really specific but since both Kubernetes and Docker are using this convention we could check if KROKI_PORT starts with tcp://. If that's the case then we could ignore the value since in these runtime environments you don't really want to change the internal listening port. You will most likely use the --publish option on Docker and targetPort on the Kubernetes service definition to map the internal default port (8000) to an external port. My only concern is that I'm not 100% sure that the protocol will always be tcp://...?

check if we are running in a container

We could set an environment variable to indicate that we are running in a container:

KROKI_CONTAINER=

If this environment variable is present then we could ignore KROKI_PORT. Or at least do something specific to accommodate Kubernetes/Docker.

Potential workarounds

Thoughts?

jkroepke commented 3 years ago

Explicitly set KROKI_PORT=8000 (not sure that's working, I guess Kubernetes/Docker will override it)

I can confirm it works in Kubernetes. It looks like user provided environment variables will be preferred. I used this workaround to run kroki on our kubernetes.

update the environment variable name to avoid conflicts

About about KROKI_LISTEN? Like KROKI_LISTEN=0.0.0.0:8000 or KROKI_LISTEN=:8000 It may also help peoples running kroki outside a container. e.g. bind services to localhost and running them behind a reverse proxy is old pattern on virtual machines.

Since kubernetes use predefined suffixes this will be not conflicted kubernetes anymore.

ggrossetie commented 3 years ago

I can confirm it works in Kubernetes. It looks like user provided environment variables will be preferred. I used this workaround to run kroki on our kubernetes.

Thanks for your input. We should probably mention that in the documentation (even though it might have some side-effects since k8s won't expect this value).

About about KROKI_LISTEN? Like KROKI_LISTEN=0.0.0.0:8000 or KROKI_LISTEN=:8000 It may also help peoples running kroki outside a container. e.g. bind services to localhost and running them behind a reverse proxy is old pattern on virtual machines.

That's a good idea, I like it! I will play a bit with it as I don't really know which formats are supported by Vert.x.

ggrossetie commented 3 years ago

I started experimenting and here's a few edge cases:

Default port

KROKI_LISTEN=0.0.0.0:8000 and KROKI_LISTEN=:8000 are equivalent since the default host is 0.0.0.0. Should we also support the following syntax:

KROKI_LISTEN=127.0.0.1

It will be equivalent to KROKI_LISTEN=127.0.0.1:8000 since the default port is 8000.

IPv6

Apparently, the notation in this case is to encode the IPv6 IP number in square brackets:

KROKI_LISTEN=[2001:db8:1f70::999:de8:7648:6e8]:100

That's RFC 3986, section 3.2.2: Host

A host identified by an Internet Protocol literal address, version 6 [RFC3513] or later, is distinguished by enclosing the IP literal within square brackets ("[" and "]"). This is the only place where square bracket characters are allowed in the URI syntax. In anticipation of future, as-yet-undefined IP literal address formats, an implementation may use an optional version flag to indicate such a format explicitly rather than rely on heuristic determination. IP-literal = "[" ( IPv6address / IPvFuture ) "]"

So the supported formats are:

ipv4 and host cannot contain :.

Default port: 8000 Default host: 0.0.0.0

Thoughts?

jkroepke commented 3 years ago

As a sysadm, KROKI_LISTEN=0.0.0.0:8000 and KROKI_LISTEN=:8000 is different.

I would expect KROKI_LISTEN=0.0.0.0:8000 means, bind only to ipv4, KROKI_LISTEN=:8000 means bind to ipv4 and ipv6.

Taking at look at https://docs.oracle.com/javase/8/docs/technotes/guides/net/ipv6_guide/index.html

When bound to ::, the method ServerSocket.accept accept connections from both IPv6 or IPv4 hosts.

I would say KROKI_LISTEN=:8000 and KROKI_LISTEN=[::]:8000 are equivalent.

I would put [::] as default host.

ggrossetie commented 3 years ago

Interesting...

I would put [::] as default host.

I was suggesting to use 0.0.0.0 as the default value because that's what Vert.x is using when you calling server.listen(8000): https://github.com/eclipse-vertx/vert.x/blob/2b65dbb81ee495661cc782e314f1d7529ae1f022/src/main/java/io/vertx/core/http/impl/HttpServerImpl.java#L155-L157

But I guess we can mention in the documentation that KROKI_PORT=8000 should be replaced by KROKI_LISTEN=0.0.0.0:8000 (and not KROKI_LISTEN=:8000, unless you want to bind to IPv4 and IPv6 as you mentioned).

I will try to pass [::] to Vert.x to see if the value is recognized.

What do you think about using square brackets around IPv6? Does it make sense? Is this the way to go? And what about using a default port if the value is missing? Under this proposal, the following will work and listen on port 8000:

KROKI_LISTEN=[::]
KROKI_LISTEN=0.0.0.0
KROKI_LISTEN=kroki.io
KROKI_LISTEN=192.168.0.1
KROKI_LISTEN=[2001:db8:1f70::999:de8:7648:6e8]

Thanks for your input!

obitech commented 3 years ago

I'm still getting this exception on Kubernetes:

{"timestamp":"1625589452727","level":"ERROR","thread":"vert.x-eventloop-thread-0","logger":"io.vertx.core.impl.launcher.commands.VertxIsolatedDeployer","message":"Failed in deploying verticle","context":"default","exception":"java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')\n\tat io.vertx.core.json.JsonObject.getInteger(JsonObject.java:168)\n\tat io.vertx.core.json.JsonObject.getInteger(JsonObject.java:435)\n\tat io.kroki.server.service.Blockdiag.<init>(Blockdiag.java:37)\n\tat io.kroki.server.Server.start(Server.java:104)\n\tat io.kroki.server.Server.lambda$start$1(Server.java:61)\n\tat io.vertx.core.impl.future.FutureImpl$3.onSuccess(FutureImpl.java:124)\n\tat io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:62)\n\tat io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:164)\n\tat io.vertx.core.impl.future.FutureImpl.onComplete(FutureImpl.java:132)\n\tat io.vertx.config.impl.ConfigRetrieverImpl.getConfig(ConfigRetrieverImpl.java:175)\n\tat io.kroki.server.Server.start(Server.java:57)\n\tat io.vertx.core.impl.DeploymentManager.lambda$doDeploy$5(DeploymentManager.java:196)\n\tat io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:96)\n\tat io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:59)\n\tat io.vertx.core.impl.EventLoopContext.lambda$runOnContext$0(EventLoopContext.java:37)\n\tat io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Unknown Source)\n"}

Even though I've set the KROKI_LISTEN env var:

            - name: KROKI_LISTEN
              value: "0.0.0.0"

Strangely I don't even have a kroki service:

Β± k get svc -n kroki-dev
NAME              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
kroki-blockdiag   ClusterIP   xxx   <none>        8001/TCP,80/TCP   22m
kroki-gateway     ClusterIP   xxx     <none>        8000/TCP,80/TCP   22m
kroki-mermaid     ClusterIP   xxx    <none>        8002/TCP,80/TCP   22m

Is there anything else I need to configure?

ggrossetie commented 3 years ago

This is not the same issue but it's somehow related:

java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')
  at io.vertx.core.json.JsonObject.getInteger(JsonObject.java:168)
  at io.vertx.core.json.JsonObject.getInteger(JsonObject.java:435)
  at io.kroki.server.service.Blockdiag.<init>(Blockdiag.java:37)
  at io.kroki.server.Server.start(Server.java:104)
  at io.kroki.server.Server.lambda$start$1(Server.java:61)
  at io.vertx.core.impl.future.FutureImpl$3.onSuccess(FutureImpl.java:124)
  at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:62)
  at io.vertx.core.impl.future.FutureImpl.addListener(FutureImpl.java:164)
  at io.vertx.core.impl.future.FutureImpl.onComplete(FutureImpl.java:132)

For reference, here's the line:

https://github.com/yuzutech/kroki/blob/51e41125c4b2a33969c76215612c1996483728b6/server/src/main/java/io/kroki/server/service/Blockdiag.java#L37

It seems that Kubernetes is also interfering with the companion containers... 😞 Summoning @teochenglim @jkroepke because they have far more experience than I do! Did you encounter this error? Is there a workaround? Thanks!

jkroepke commented 3 years ago

@Mogztter

Sure, base on the service kroki-blockdiag kubernetes expose environment variables like KROKI_BLOCKDIAG_PORT, too.

Its described here (https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables) for the service redis-master.

There are two ways to resolve that issue in general:

@obitech Your workaround would be something like define KROKI_BLOCKDIAG_PORT=8001 in your deployment, too.

obitech commented 3 years ago

@jkroepke thanks for your quick reply! Setting the ports explicitly via the env vars KROKI_BLOCKDIAG_PORT and KROKI_MERMAID_PORT fixed the issue πŸ™

ggrossetie commented 3 years ago

@jkroepke Thanks πŸ™ŒπŸ»

I'm glad it's working for you @obitech.

I will open a new issue to track progress.