rycus86 / podlike

Co-located containers as Docker Swarm services (like Kubernetes pods)
MIT License

service-mesh example / 404 error on port 80 #1

Closed pascalandy closed 6 years ago

pascalandy commented 6 years ago

There is nothing on port 80.

It looks like only the Traefik router is deployed. I don't see the Traefik proxy. From your design, I feel we should see 4 Traefik services, right?

screen shot 2018-05-25 at 1 02 37 pm

screen shot 2018-05-25 at 1 02 32 pm

screen shot 2018-05-25 at 1 08 01 pm

rycus86 commented 6 years ago

Hi @pascalandy ,

Thanks for trying the example! I just tried it on a single PWD machine, then on a 3-manager, 2-worker PWD cluster from the template, and both seem to work OK. How did you add the 4 instances? I hope to be able to replicate it if I do it the same way as you did. :)

It's true, though, that nothing is listening on http://127.0.0.1. I might add a default redirect there or an info page.

Thanks!

pascalandy commented 6 years ago

Oh, I tried with only 1 instance. Let me see.

rycus86 commented 6 years ago

Sorry, I missed the question about the 4 Traefik instances. There are 4, but 3 of them are running as a container started by the Swarm task. Swarm itself doesn't know about those directly, so they won't show up in docker service ls or docker service ps. If you check where the tasks are currently running and do a docker ps on that host, you'll see the components there, for example:

[manager2] (local) root@192.168.0.32 ~
$ docker ps
CONTAINER ID        IMAGE                    COMMAND                   CREATED             STATUS                   PORTS               NAMES
57b91f5b89d8        traefik                  "/traefik --consulca…"    8 minutes ago       Up 8 minutes                                 mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.traefik
799c71e25864        traefik                  "/traefik --consulca…"    8 minutes ago       Up 8 minutes                                 mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.traefik
4b44ec326b75        consul                   "docker-entrypoint.s…"    8 minutes ago       Up 8 minutes                                 mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.consul-agent
10befd947d2f        consul                   "docker-entrypoint.s…"    8 minutes ago       Up 8 minutes                                 mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.consul-agent
d6627948a204        python:2.7-alpine        "python -c '\nimport …"   8 minutes ago       Up 8 minutes                                 mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.app
f5ca53142ea2        python:2.7-alpine        "python -c '\nimport …"   8 minutes ago       Up 8 minutes                                 mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.app
46456897dee5        rycus86/podlike:latest   "/podlike -logs"          8 minutes ago       Up 8 minutes (healthy)                       mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb
e547af820073        rycus86/podlike:latest   "/podlike -logs"          8 minutes ago       Up 8 minutes (healthy)                       mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf

This PWD node seems to host the calc and the mul "pods" currently.
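
To make the grouping easier to see, here is a minimal Python sketch that sorts the container names from the docker ps listing above into "pods". It assumes the naming pattern visible in that output, where a component is called <task name>.podlike.<component>; the separator constant and the group_by_pod helper are made up for this illustration and are not part of podlike.

```python
# Container names copied from the `docker ps` output above.
NAMES = [
    "mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.traefik",
    "mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.traefik",
    "mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.consul-agent",
    "mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.consul-agent",
    "mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb.podlike.app",
    "mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf.podlike.app",
    "mesh_calc.1.vq979s5xvwaa2zrlohh85z2gb",
    "mesh_mul.1.ogjzg7xvscutpfdmgr56v4hkf",
]

# Assumption: inferred from the names above, not from the podlike docs.
MARKER = ".podlike."


def group_by_pod(names):
    """Map each Swarm task container name to its sorted component names."""
    pods = {}
    for name in names:
        task, sep, component = name.partition(MARKER)
        # A name without the marker is the Swarm task container itself.
        components = pods.setdefault(task, [])
        if sep:
            components.append(component)
    return {task: sorted(components) for task, components in pods.items()}


if __name__ == "__main__":
    for task, components in sorted(group_by_pod(NAMES).items()):
        print(task, "->", ", ".join(components))
```

Each Swarm task (the rycus86/podlike container) ends up with three components: app, consul-agent and traefik. Only the task containers are visible to docker service ps.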

pascalandy commented 6 years ago

I deployed using 3 Managers + 2 workers, same result.

$ curl -s http://127.0.0.1/v1/add/12/47
404 page not found

[manager1] (local) root@192.168.0.33 ~
$ curl -s http://localhost/v1/add/12/47
404 page not found

[manager1] (local) root@192.168.0.33 ~
$ curl -s http://127.0.0.1/v1/add/12/47
404 page not found

[manager1] (local) root@192.168.0.33 ~
$ curl -s http://127.0.0.1/v2/add/12/47
404 page not found

pascalandy commented 6 years ago

BTW, you might be interested in this project I built: https://github.com/pascalandy/docker-stack-this/tree/master/traefik_stack5

The main goal is to create the easiest and quickest docker stack available in the Docker Swarm ecosystem. I really wish I'd had this when I started (a while back, when docker-compose was still called fig).

rycus86 commented 6 years ago

What is docker stack ps mesh --filter desired-state=running giving you? Also, check what you see on port 8500; the /ui/#/dc1/services endpoint should show all services as healthy.

pascalandy commented 6 years ago

Got it :)

There are 4, but 3 of them are running as a container started by the Swarm task.

pascalandy commented 6 years ago

$ docker stack ps mesh --filter desired-state=running
ID                  NAME                IMAGE                      NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
rtflejkgnxu8        mesh_zipkin.1       openzipkin/zipkin:latest   manager2            Running             Running 13 minutes ago
a6w9i3p8dmg4        mesh_consul.1       consul:latest              manager2            Running             Running 13 minutes ago
q2w4ro1e2abh        mesh_router.1       traefik:latest             worker2             Running             Running 13 minutes ago
bejznh9lo8ou        mesh_mul.1          rycus86/podlike:latest     worker1             Running             Running 13 minutes ago
qg4a9zexfmre        mesh_add.1          rycus86/podlike:latest     manager1            Running             Running 13 minutes ago
kohhcc0p4vml        mesh_calc.1         rycus86/podlike:latest     manager3            Running             Running 13 minutes ago

And yes, the services all look healthy.

screen shot 2018-05-25 at 2 05 06 pm

rycus86 commented 6 years ago

BTW, you might be interested in this project I built: https://github.com/pascalandy/docker-stack-this/tree/master/traefik_stack5

Nice one, starred, and will have a play with them. :)

rycus86 commented 6 years ago

Is there a link to your PWD cluster I could look at quickly? Not sure if it can be shared, but just in case.

pascalandy commented 6 years ago

What you built here is clearly next level!

podlike reminds me of when some people started to build fig (which became compose).

pascalandy commented 6 years ago

Sure, let's try this: https://labs.play-with-docker.com/p/bc44ov7ndhl0008fh5t0#bc44ov7n_bc44p1vndhl0008fh5tg

rycus86 commented 6 years ago

OK, so I've seen this happening here and there locally, but then sort of forgot about it... :) Traefik seems to miss or ignore some updates from the Consul Catalog provider when the service switches from unhealthy to healthy, or maybe with some other ordering of events coming from service discovery. Restarting one of the tasks kicked off a successful update, and that fixed it. If I have time and manage to reproduce this reliably, I'd like to have a look at the Traefik code to see if I can submit a fix for it. :)

pascalandy commented 6 years ago

Wow. You're good! I tried many stacks with Traefik + Consul and I always ran into a dead end. The cause you found might explain why I could never make it work.

rycus86 commented 6 years ago

Thanks! :) If you check the code at this location and below, you can see that Traefik is trying to work out if the change event should reload the config or not: https://github.com/containous/traefik/blob/master/provider/consulcatalog/consul_catalog.go#L345

Looks like it might not work reliably in all cases.
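
As a purely hypothetical illustration of how a "should this event reload the config?" check can miss a health transition, here is a small Python sketch. It is not taken from the Traefik code and is not claimed to be the actual bug; the node ID, the status strings and both function names are invented for the example.

```python
# Hypothetical sketch only; this is NOT Traefik's implementation.

def nodes_changed(previous, current):
    """Naive check: only notices nodes being added or removed."""
    return set(previous) != set(current)


def nodes_or_health_changed(previous, current):
    """Stricter check: also notices a node whose health status changed."""
    return previous != current


# Snapshots map a made-up node ID to its health status.
before = {"calc-1": "critical"}
after = {"calc-1": "passing"}

if __name__ == "__main__":
    print(nodes_changed(before, after))            # False
    print(nodes_or_health_changed(before, after))  # True
```

With the naive check, a service that goes from unhealthy to healthy without any node being added or removed never triggers a reload, which would match the symptom seen here: everything is healthy in Consul, yet the router still answers with a 404.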

rycus86 commented 6 years ago

OK, I think this should fix it: https://github.com/containous/traefik/pull/3395 :)

pascalandy commented 6 years ago

When it's fixed, I'll add it as a mono repo in the docker-stack-this project.

rycus86 commented 6 years ago

Eventually, it got fixed in https://github.com/containous/traefik/pull/3390 and is now released: https://github.com/containous/traefik/releases/tag/v1.6.3

pascalandy commented 6 years ago

Great! Will try.