TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0

Improve observability of application server integrations #4520

Closed adriansmares closed 2 years ago

adriansmares commented 3 years ago

Summary

We should probably improve the observability of application server integrations.

Why do we need this?

In order to provide feedback on how integrations behave.

What is already there? What do you see now?

Integrations based on application packages have events that are triggered on failure.

What is missing? What do you want to see?

  1. For packages that don't emit service data, an event signaling that the message has been processed would probably help.
    • Do we want this for packages that emit service data as well, just for symmetry?
  2. Webhooks should emit an event when the request succeeds or fails.
    • The current internal pipeline carries only an http.Request, but we can carry some closure that signals the request status.

Environment

3.14 but not relevant

How do you propose to implement this?

It's just a matter of adding the events, which is not hard. The real question is whether we actually want to add them.

How do you propose to test this?

Test the integrations with working and non-working endpoints.

Can you do this yourself and submit a Pull Request?

Yes. cc @johanstokking / @neoaggelos for opinions on this.

johanstokking commented 3 years ago

In order to provide feedback on how integrations behave.

To whom?

If the audience is application owners, these events would only show that the integration works. If things don't work, we should already have error events. Webhooks that don't return a successful HTTP status code should result in error events. Webhooks that time out should result in error events.

If the audience is operators, they don't have access to these events. However, behavior and performance are relevant to them in an aggregated way. So, metrics in this case.

  1. Webhooks should emit an event when the request succeeds or fails.
    • The current internal pipeline carries only an http.Request, but we can carry some closure that signals the request status.

Yes

adriansmares commented 3 years ago

To whom?

That's indeed a good question. The target audience, judging from the original issue (https://github.com/TheThingsIndustries/lorawan-stack-support/issues/525), seems to be application owners - we should probably focus on that.

Network operators already have quite a few metrics, regarding both flow and errors.

johanstokking commented 3 years ago

OK, I see. So, in the case of webhooks, we would be publishing a success event that includes the compiled URL, status code and time?

In the case of application packages, there would be a specific success event for the package concerned?

I would be okay with this. It will add quite a few events, but these seem more relevant to me than some events that we currently have (most notably in NS), which we can strip if need be.

adriansmares commented 3 years ago

So, in the case of webhooks, we would be publishing a success event that includes the compiled URL, status code and time?

:+1:

In the case of application packages, there would be a specific success event for the package concerned?

:+1:

I would also add that probably the success events should not be visible in the Console unless verbose logging has been enabled.
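A small sketch of that visibility rule, assuming each event carries a flag marking it as verbose-only. The `event` type and `visible` function are hypothetical; how the Console actually filters events is not described in this thread.

```go
package main

// event is a hypothetical, simplified event record.
type event struct {
	Name    string
	Verbose bool // true for events that should be hidden by default
}

// visible returns the events to show: everything when verbose logging is
// enabled, otherwise only the events not marked as verbose.
func visible(events []event, verbose bool) []event {
	out := make([]event, 0, len(events))
	for _, e := range events {
		if verbose || !e.Verbose {
			out = append(out, e)
		}
	}
	return out
}
```

With this rule, failure events stay visible by default while success events only appear once verbose logging is switched on.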