goss-org / goss

Quick and Easy server testing/validation
https://goss.rocks
Apache License 2.0

Goss returns 200 response although reporting failure. Please help #972

Open fernandino143 opened 2 months ago

fernandino143 commented 2 months ago

Hi there. I've been using goss for several years and it has always served me perfectly. You guys are bona fide heroes! However, this started happening only today (at least to my knowledge). We run a very old version of goss:

goss --version
goss version 0.3.23

And one particular ldap server was presenting an error:

cat /etc/goss/readiness.yaml
gossfile:
  /etc/goss/conf.d/readiness-*.yaml: {}
goss -g /etc/goss/readiness.yaml --vars /etc/goss/vars.yaml v -f tap 
1..4
not ok 1 - Port: tcp:636: listening: doesn't match, expect: [true] found: [false]
ok 2 - # SKIP Port: tcp:636: ip: skipped
ok 3 - Service: slapd: enabled: matches expectation: [true]
not ok 4 - Service: slapd: running: doesn't match, expect: [true] found: [false]

However, when I used curl, it returned 200:

curl -s localhost:8042/healthz -I
HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 09 Sep 2024 14:24:42 GMT
Content-Length: 885

The conf is pretty simple:

cat /etc/goss/conf.d/readiness-ldap.yaml  | jq
{
  "port": {
    "tcp:636": {
      "ip": [
        "0.0.0.0"
      ],
      "listening": true,
      "title": "Service is listening on TCP port 636"
    }
  },
  "service": {
    "slapd": {
      "enabled": true,
      "running": true,
      "title": "The slapd service is enabled and running"
    }
  }
}

Any chance this is being cached somehow? I noticed the changelog for v4, but I can't find the code behind the entry "Calculated from first test start, this allows accurate reporting when showing a cached result".
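
One rough way to test the caching theory: `goss serve` documents a `--cache` option with a default of 5s (not confirmed here for 0.3.23, so treat that as an assumption). Two requests spaced further apart than that window should each reflect a fresh run. The helper name `probe_twice` and the 10-second gap below are illustrative choices, not part of goss.

```shell
#!/bin/sh
# Hypothetical helper: request the endpoint twice, `gap` seconds apart,
# and print the HTTP status code of each request.
# If both stay at 200 while `goss validate` keeps failing, a short-lived
# result cache alone would not explain the stale answer.
probe_twice() {
    url=$1
    gap=$2
    first=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    sleep "$gap"
    second=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    echo "first=$first second=$second"
}

# Example (endpoint taken from this report; the 10s gap is an assumption):
# probe_twice localhost:8042/healthz 10
```

If the second request still returns 200 well past the cache window, the stale result is probably coming from somewhere other than the normal cache expiry.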

aelsabbahy commented 2 months ago

> Hi there. I've been using goss for several years and it has always served me perfectly. You guys are bona fide heroes!

❤️ Much appreciated, glad it's provided you value.

Just so I'm clear: is the latest version failing for you both when curling and when running goss validate? Or are goss validate and goss serve giving different results?

fernandino143 commented 2 months ago

I'm not running the latest version. I'm still running "goss version 0.3.23", but validate shows there's something wrong, and serve is returning 200.

After I restarted the goss service, serve picked up the same result as validate and started returning a 503. I don't think that's normal; I've never seen it before.
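
A minimal sketch of how the disagreement between the two could be detected automatically. It assumes `goss validate` exits 0 only when every test passes and that `goss serve` answers 200 on success and something else (503 in this report) on failure. The function name `compare_results` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given the exit code of `goss validate` and the HTTP
# status returned by `goss serve`, report whether the two agree.
# Assumes exit code 0 means "all tests passed" and HTTP 200 means the same;
# any other status (for example 503) is treated as a failure.
compare_results() {
    validate_rc=$1
    http_code=$2
    if [ "$validate_rc" -eq 0 ]; then v=pass; else v=fail; fi
    if [ "$http_code" = "200" ]; then s=pass; else s=fail; fi
    if [ "$v" = "$s" ]; then
        echo "consistent: validate=$v serve=$s"
    else
        echo "MISMATCH: validate=$v serve=$s"
        return 1
    fi
}

# Example, using the paths and port from this report:
# goss -g /etc/goss/readiness.yaml --vars /etc/goss/vars.yaml validate >/dev/null 2>&1
# rc=$?
# code=$(curl -s -o /dev/null -w '%{http_code}' localhost:8042/healthz)
# compare_results "$rc" "$code"
```

Running this from cron and alerting on a non-zero return would flag the stale state without waiting for someone to notice it by hand.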

fernandino143 commented 2 months ago

This just happened again. Is there a way to add logging or something else that might explain why goss is not returning the expected result?

# curl -I -s localhost:8042/healthz 
HTTP/1.1 200 OK
Content-Type: application/json
Date: Wed, 11 Sep 2024 13:34:30 GMT
Content-Length: 885

# goss -g /etc/goss/readiness.yaml --vars /etc/goss/vars.yaml v -f tap -o verbose
1..6
not ok 1 - Port: tcp:636: listening: doesn't match, expect: [true] found: [false]
ok 2 - # SKIP Port: tcp:636: ip: skipped
ok 3 - Service: slapd: enabled: matches expectation: [true]
not ok 4 - Service: slapd: running: doesn't match, expect: [true] found: [false]
...

aelsabbahy commented 2 months ago

What happens when you curl without the -I? Does the response body show all the tests passing?

Is it reproducible or is it happening intermittently? Is there any chance that slapd is intermittently running/crashing/rerunning?

Also, does this happen with the latest version of Goss? Interesting problem. If you have a way to reproduce this, I can dig into it more.
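
As a sketch of the "curl without -I" check: the response is served as application/json, so the body can be inspected for failures. This assumes the JSON summary carries a `failed-count` field, as goss's JSON output format normally does; the helper name `failed_count` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a goss JSON response body on stdin and print
# the number of failed tests.
# Assumes the summary object contains a "failed-count" field.
failed_count() {
    sed -n 's/.*"failed-count":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example: log a timestamped sample every 30 seconds so that intermittent
# flapping of slapd would show up over time (interval is an assumption):
# while true; do
#     body=$(curl -s localhost:8042/healthz)
#     printf '%s failed=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
#         "$(printf '%s\n' "$body" | failed_count)"
#     sleep 30
# done
```

A body reporting zero failures alongside a failing `goss validate` would point at a stale result on the serve side rather than at slapd flapping.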

fernandino143 commented 2 months ago

Sorry for taking so long to reply. I haven't seen the issue again, so I haven't been able to test without -I yet.

I'm checking whether I can get the client upgraded to the latest stable release across the fleet next week.