thoughtworks / metrik

An easy-to-use, cross-platform measurement tool that pulls data out of CD pipelines and analysis the four key metrics for you.
MIT License
354 stars 87 forks source link

The back-end API boot is failing inside the container #99

Open wcalderipe opened 2 years ago

wcalderipe commented 2 years ago

Hey folks 👋

Instructions from the README to start the application don't work. The API boot is failing inside the container (see the logs on the "Additional context" section).

To Reproduce

Steps to reproduce the behavior:

  1. Run docker run -d -p 80:80 --name metrik public.ecr.aws/j2s5d3z8/4-key-metrics:latest
  2. Go to http://localhost:80/
  3. See the warning message on the top-right "Request failed with status code 502"

The browser's network inspector says the request to http://localhost/api/project is returning 502.

Expected behavior

Accordingly, with the README, after running the docker container, the application should be working at localhost:80.

Screenshots

image

image

Desktop

Additional context

Log files from the container:

These two seems the most relevant:

Other log files:

The NGINX configuration file says the API is running at port 9000 inside the container:

location /api/ {
  proxy_pass http://127.0.0.1:9000;

  # Extends timeout for data sync process
  # Should find a better way to sync data more efficiently
  proxy_connect_timeout       600;
  proxy_send_timeout          600;
  proxy_read_timeout          600;
}

See nginx_release.conf

Checking the API availability inside the container:

~/dev❯ docker ps
CONTAINER ID   IMAGE                                          COMMAND         CREATED      STATUS          PORTS                                                    NAMES
04d9d373df97   public.ecr.aws/j2s5d3z8/4-key-metrics:latest   "supervisord"   4 days ago   Up 20 minutes   9000/tcp, 0.0.0.0:80->80/tcp, :::80->80/tcp, 27017/tcp   metrik

Requests to http://127.0.0.1:9000 are failing

~/dev❯ docker exec -it 04d /bin/bash
root@04d9d373df97:/# curl http://127.0.0.1:9000
curl: (7) Failed to connect to 127.0.0.1 port 9000: Connection refused

Only port 80 is open inside the container:

root@04d9d373df97:/# lsof -i -P -n | grep LISTEN
nginx    21 root    6u  IPv4  20693      0t0  TCP *:80 (LISTEN)

NGINX logs:

==> nginx/error.log <==
2021/09/23 06:42:45 [emerg] 35#35: io_setup() failed (38: Function not implemented)
2021/09/23 06:42:45 [emerg] 31#31: io_setup() failed (38: Function not implemented)
2021/09/23 06:42:45 [emerg] 37#37: io_setup() failed (38: Function not implemented)
2021/09/23 06:43:00 [error] 36#36: *4 connect() failed (111: Connection refused) while connecting to upstream, client: 172.17.0.1, server: localhost, request: "GET /api/project HTTP/1.1", upstream: "http://127.0.0.1:9000/api/project", host: "localhost", referrer: "http://localhost/"
2021/09/23 06:43:53 [error] 36#36: *4 connect() failed (111: Connection refused) while connecting to upstream, client: 172.17.0.1, server: localhost, request: "GET /api/project HTTP/1.1", upstream: "http://127.0.0.1:9000/api/project", host: "localhost", referrer: "http://localhost/"

==> nginx/access.log <==
172.17.0.1 - - [23/Sep/2021:06:53:23 +0000] "GET /api/project HTTP/1.1" 502 559 "http://localhost/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36" "-"
172.17.0.1 - - [23/Sep/2021:06:53:54 +0000] "GET / HTTP/1.1" 200 539 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36" "-"
172.17.0.1 - - [23/Sep/2021:06:53:54 +0000] "GET /api/project HTTP/1.1" 502 559 "http://localhost/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36" "-"
172.17.0.1 - - [23/Sep/2021:06:53:54 +0000] "GET /favicon.svg HTTP/1.1" 200 562 "http://localhost/" "Mozilla/5.0

Application logs:

 ==> 4km-service-2021-09-19-1.log <==
    ... 194 more
Caused by: java.lang.ClassFormatError: Illegal class name "java/lang/Class[]" in class file org/springframework/data/mongodb/core/MongoTemplate$$EnhancerBySpringCGLIB$$59b863f0
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.System$2.defineClass(Unknown Source)
    at java.base/java.lang.invoke.MethodHandles$Lookup.defineClass(Unknown Source)
    at java.base/jdk.internal.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.cglib.core.ReflectUtils.defineClass(ReflectUtils.java:509)
    ... 208 more

==> 4km-service.log <==
    ... 194 more
Caused by: java.lang.ClassFormatError: Illegal class name "java/lang/Class[]" in class file org/springframework/data/mongodb/core/MongoTemplate$$EnhancerBySpringCGLIB$$8461e687
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.System$2.defineClass(Unknown Source)
    at java.base/java.lang.invoke.MethodHandles$Lookup.defineClass(Unknown Source)
    at java.base/jdk.internal.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.cglib.core.ReflectUtils.defineClass(ReflectUtils.java:509)
    ... 208 more
hyrepo commented 2 years ago

Hi @wcalderipe , thank you for providing such rich information on the issue, it looks so professional and gives us a good example of an issue template, respect!

Quick solution for this issue: wait for about 20 seconds then refresh the page.

This is actually a known issue, the root cause seems to be the MongoDB is not ready for connection in the initial few seconds after the application started, we'll implement a better solution to handle this when we have capacity.

wcalderipe commented 2 years ago

@hyrepo, thanks for the fast reply, and congrats on the awesome project 🙌 I'm really excited to try it out!

I've just tried to wait for 20 seconds as you suggested but, unfortunately, it doesn't seem to work.

~❯ date
Fri Sep 24 09:15:28 CEST 2021
~❯ curl http://localhost/api/project
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.21.2</center>
</body>
</html>
~❯ date
Fri Sep 24 09:17:34 CEST 2021
~❯ curl http://localhost/api/project
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.21.2</center>
</body>
</html>

Note there's a space of ~2 minutes between each curl request to /api/project.

Is there any other lead you have in mind? If you point me in the right direction I wouldn't mind trying it myself.

hyrepo commented 2 years ago

Hi @wcalderipe, I just had a closer look at this issue, it's not the one I mentioned above, in this one, the backend part failed to get started, so you keep getting 502 error.

Seems it was caused by the Docker in M1 chip.

The only environmental difference between mine and yours is the CPU, my Mac uses Inter chip, so I tested it again in an M1 Mac, then I got the same error as yours.

We'll try to fix it, but at the current stage, I'm not quite sure about the solution.

There're two temporary solutions that came into my mind:

wildan3105 commented 1 year ago

Hi, I'm also experiencing this but with a different machine.

And also, I'm trying to build the docker image and run the container by myself from this cloned repository. Here's what I did:

But I'm always getting the 502 Bad gateway everytime hit the API http://localhost:3000/api/project. I suppose there's something wrong with the MongoDB provisioning because I don't see any other error (in nginx and 4km-service log). Here's the error from MongoDB:

{"t":{"$date":"2022-11-11T02:25:25.293+00:00"},"s":"I",  "c":"REPL",     "id":21313,   "ctx":"initandlisten","msg":"Did not find local replica set configuration document at startup","attr":{"error":{"code":47,"codeName":"NoMatchingDocument","errmsg":"Did not find replica set configuration document in local.system.replset"}}}
{"t":{"$date":"2022-11-11T02:25:25.294+00:00"},"s":"I",  "c":"CONTROL",  "id":20714,   "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2022-11-11T02:25:25.294+00:00"},"s":"I",  "c":"CONTROL",  "id":20711,   "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2022-11-11T02:25:25.294+00:00"},"s":"I",  "c":"REPL",     "id":40440,   "ctx":"initandlisten","msg":"Starting the TopologyVersionObserver"}
{"t":{"$date":"2022-11-11T02:25:25.294+00:00"},"s":"I",  "c":"REPL",     "id":40445,   "ctx":"TopologyVersionObserver","msg":"Started TopologyVersionObserver"}
{"t":{"$date":"2022-11-11T02:25:25.294+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"/tmp/mongodb-27017.sock"}}
{"t":{"$date":"2022-11-11T02:25:25.295+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"0.0.0.0"}}
{"t":{"$date":"2022-11-11T02:25:25.295+00:00"},"s":"I",  "c":"NETWORK",  "id":23016,   "ctx":"listener","msg":"Waiting for connections","attr":{"port":27017,"ssl":"off"}}
{"t":{"$date":"2022-11-11T02:26:25.171+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"WTCheckpointThread","msg":"WiredTiger message","attr":{"message":"[1668133585:171654][10:0x7f34b0934700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 43, snapshot max: 43 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0)"}}
{"t":{"$date":"2022-11-11T02:27:25.244+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"WTCheckpointThread","msg":"WiredTiger message","attr":{"message":"[1668133645:244197][10:0x7f34b0934700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 45, snapshot max: 45 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0)"}}
{"t":{"$date":"2022-11-11T02:28:25.291+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"WTCheckpointThread","msg":"WiredTiger message","attr":{"message":"[1668133705:291122][10:0x7f34b0934700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 46, snapshot max: 46 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0)"}}
{"t":{"$date":"2022-11-11T02:29:25.321+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"WTCheckpointThread","msg":"WiredTiger message","attr":{"message":"[1668133765:321858][10:0x7f34b0934700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 47, snapshot max: 47 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0)"}}

What has possibly gone wrong from my steps above? Did I build the backend app correctly? I'm still having doubt at the moment whether my steps above are the same as what you guys did to build the public image on AWS.

Thank you very much.

andrecarlucci commented 11 months ago

+1, exactly the same 502 error on a M1 machine :(

rbeen commented 11 months ago

I ran into this exact same issue on my Apple M2 machine. Reading about nginx running on 9000 I decided to try something with the ports and simply added port 9000 to the run command:

docker run -d -p 80:80 -p 9000:9000 --name metrik --platform linux/amd64 public.ecr.aws/j2s5d3z8/4-key-metrics:latest

This seems to have fixed it!

wcalderipe commented 11 months ago

@rbeen I tried your command sample. Unfortunately, it still doesn't work on my end.

❯ docker run -d -p 80:80 -p 9000:9000 --name metrik --platform linux/amd64 public.ecr.aws/j2s5d3z8/4-key-metrics:latest                                                                                                                               
1dcb4f1d532850f832f22e0e30f2796ddfbe879271dcb05466fb0627ec5873cc
❯ docker ps                                                                                                                                                                                                                                           
CONTAINER ID   IMAGE                                          COMMAND                  CREATED         STATUS         PORTS                                                   NAMES
1dcb4f1d5328   public.ecr.aws/j2s5d3z8/4-key-metrics:latest   "supervisord"            4 seconds ago   Up 3 seconds   0.0.0.0:80->80/tcp, 0.0.0.0:9000->9000/tcp, 27017/tcp   metrik
image

Macbook Pro Apple M2 Max on macOS 13.4

wcalderipe commented 11 months ago

After a few minutes, it works. Exactly as @hyrepo mentioned.