OneUptime / oneuptime

OneUptime is the complete open-source observability platform.
https://oneuptime.com
Apache License 2.0

Bug: Probe operational but no effects #1492

Open pschapler opened 3 months ago

pschapler commented 3 months ago

Describe the bug

We installed the latest Docker-based version of OneUptime and created a couple of monitors, most of them of the "website" type. The probe Docker containers are online, and the admin section shows a green status for them and for the individual monitors. All of them show the green icon and the "operational" status.

However, none of the probes shows any logs. For testing, we created a ping-type monitor for an internal 10.0.0.x address that must fail because no such IP address exists. Nevertheless, the probe does not detect this and the status remains operational; the monitor never enters the failed state even though the ping cannot succeed.

How can we check or correct that?

To Reproduce

Steps to reproduce the behavior:

  1. Create a ping monitor for a non-existent IP address
  2. Check that probes are operational
  3. Start monitoring for a short interval
  4. See error: the status of the monitor does not change

Expected behavior

The probe should detect that the IP address is unreachable, update the status of the monitor, and produce logs that can be inspected.

Desktop (please complete the following information):

Deployment Type

This is the self-hosted, Docker-based version of OneUptime.

simlarsen commented 3 months ago

Couple of things:

pschapler commented 3 months ago

Thanks for the reply.

Yes, the probes are assigned to the monitor in question, show a green "connected" status, and are enabled. On the same page, when I click on "View Logs", it says "Not monitored yet", although the monitor is active and the monitoring interval is 5 minutes.

I take it you mean the config.env file; the variable is LOG_LEVEL=DEBUG. I will post the probe logs...

pschapler commented 3 months ago

There is indeed something in the logs of the docker container:

oneuptime/probe:release

Probe is not registered yet. Skipping alive check.
Trying to register probe again...
Failed to register probe. Retrying after 30 seconds...
APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16)
    at async Function._registerProbe (/usr/src/app/Services/Register.ts:111:54)
    at async Function.registerProbe (/usr/src/app/Services/Register.ts:86:17)
    at async runFunction (/usr/src/app/Jobs/Alive.ts:29:13)
    at async Task._execution (/usr/src/CommonServer/Utils/BasicCron.ts:24:13) {
  _code: 2
}
Failed to register probe. Retrying after 30 seconds...
APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16)
    at async Function._registerProbe (/usr/src/app/Services/Register.ts:111:54)
    at async Function.registerProbe (/usr/src/app/Services/Register.ts:86:17)
    at async init (/usr/src/app/Index.ts:31:13) {
  _code: 2
}
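
Editor's note: the log above shows the probe retrying registration at a fixed 30-second interval. As a rough illustration of that pattern only, a fixed-delay retry loop might look like the Python sketch below. This is not OneUptime's code (the probe's Register.ts is not shown in this thread); the names `register_with_retry`, `register`, `max_attempts`, and `sleep` are hypothetical, and the log does not say whether the real probe ever stops retrying.

```python
import time
from typing import Callable


def register_with_retry(
    register: Callable[[], None],
    delay_seconds: float = 30.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `register` until it succeeds; return the number of attempts used.

    `register` is a hypothetical callable that raises ConnectionError when
    the registration endpoint cannot be reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            register()
            return attempt
        except ConnectionError as err:
            if attempt == max_attempts:
                # Give up and surface the last failure to the caller.
                raise RuntimeError(
                    f"registration failed after {max_attempts} attempts"
                ) from err
            print(
                f"Failed to register probe ({err}). "
                f"Retrying after {delay_seconds:g} seconds..."
            )
            sleep(delay_seconds)

    # Unreachable: the loop either returns or raises.
    raise AssertionError("unreachable")
```

A loop like this only hides the underlying problem if the endpoint never becomes available, which is why the repeated "Endpoint is not available" message is the part worth investigating.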

How can I check for the endpoint in question?

simlarsen commented 3 months ago

@pschapler can you please also update the software to the latest version and see if you still face this?

simlarsen commented 3 months ago

Once you update the software to the latest version, the logs should show which endpoint could not be reached.

pschapler commented 3 months ago

Just did as you said - here is the new log entry:

Registering Probe...
Sending request to: http://ingestor:3400/ingestor/register
Failed to register probe. Retrying after 30 seconds...
APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16)
    at async Function._registerProbe (/usr/src/app/Services/Register.ts:111:54)
    at async Function.registerProbe (/usr/src/app/Services/Register.ts:86:17)
    at async init (/usr/src/app/Index.ts:31:13) {
  _code: 2
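
Editor's note: with the URL from the log in hand, one way to narrow down "Endpoint is not available" is to check, step by step, how far a request gets: name resolution, TCP connection, then an HTTP response. The Python sketch below does that with the standard library only. It is a diagnostic aid under assumptions, not part of OneUptime: the function name `diagnose_endpoint` and its return labels are made up for illustration, it sends a plain GET (the probe itself uses POST), and it has to run from a container attached to the same Docker network, since the hostname `ingestor` presumably resolves only there.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def diagnose_endpoint(url: str, timeout: float = 5.0) -> str:
    """Return a short label describing how far a request to `url` gets."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return "invalid-url"
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # Step 1: does the hostname resolve at all?
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        return "dns-failure"

    # Step 2: is anything accepting TCP connections on that port?
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return "connection-failed"

    # Step 3: does the server answer an HTTP request? Any status code,
    # even 404 or 405, shows that the service itself is reachable.
    # Proxies are bypassed so the check talks to the service directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return f"http-{response.status}"
    except urllib.error.HTTPError as err:
        return f"http-{err.code}"
    except (urllib.error.URLError, OSError):
        return "http-no-response"


if __name__ == "__main__":
    # URL taken from the probe log above.
    print(diagnose_endpoint("http://ingestor:3400/ingestor/register"))
```

A result of "dns-failure" or "connection-failed" would point at the Docker network or the ingestor container, while an "http-..." result would suggest the ingestor is reachable and the failure lies further along the request path.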

simlarsen commented 3 months ago

Can you please check if the ingestor container is up and running?

pschapler commented 3 months ago

Yes, it is up, but producing errors:

APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16) {
  _code: 2
}
APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16) {
  _code: 2

pschapler commented 3 months ago

And

ingestor server started on port: 3400
Postgres Database Connected
Connected to Redis Session Store!
Redis connected on redis:6379
Clickhouse Database Connected: http://clickhouse:8123
Realtime socket server initialized
APIException [Error]: Endpoint is not available
    at Function.getErrorResponse (/usr/src/Common/Utils/API.ts:346:15)
    at Function.fetch (/usr/src/Common/Utils/API.ts:327:38)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Function.post (/usr/src/Common/Utils/API.ts:258:16) {
  _code: 2
}