Open krmoos opened 1 year ago
Bjarke pointed me to this epic to post a few thoughts on observability. We should set up tracing so that we can trace across services even when they communicate via asynchronous events rather than synchronous HTTP calls. This means events should carry an activity id and similar tracing attributes. It is also probably important to collect metrics such as the number of messages in queues and throughput, so we can detect when queues are growing and consumers cannot keep up.
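To make the idea concrete, here is a minimal sketch in Python of events carrying tracing attributes, plus a naive check for a growing queue. Everything in it is an assumption for illustration: the `EventEnvelope` type, its field names (`correlation_id`, `causation_id`) and the helper functions are hypothetical and not part of any DH3 schema or library.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical envelope; field names are illustrative, not a DH3 schema."""
    name: str
    payload: dict
    correlation_id: str  # shared by every event belonging to the same action
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    causation_id: Optional[str] = None  # id of the event that triggered this one


def start_action(name: str, payload: dict) -> EventEnvelope:
    """First event of an action: mint a new correlation id."""
    return EventEnvelope(name=name, payload=payload, correlation_id=str(uuid.uuid4()))


def follow_up(parent: EventEnvelope, name: str, payload: dict) -> EventEnvelope:
    """Event emitted while handling `parent`: inherit its correlation id."""
    return EventEnvelope(
        name=name,
        payload=payload,
        correlation_id=parent.correlation_id,
        causation_id=parent.event_id,
    )


def backlog_is_growing(depth_samples: list[int]) -> bool:
    """True if queue depth rose between every pair of consecutive samples."""
    return len(depth_samples) >= 2 and all(
        later > earlier for earlier, later in zip(depth_samples, depth_samples[1:])
    )
```

A consumer would call `follow_up` for every event it publishes while handling an incoming one, so the whole chain shares one correlation id. The backlog check is deliberately naive; a real setup would more likely alert on queue metrics from the broker itself.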
Agree with @rvplauborg. We have done this previously by tracking a correlation id across all domains for the same "action".
It makes it easier to identify the error in the logs and to see all events leading up to the error for that specific action.
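As a rough illustration of why a shared correlation id helps in the logs, the sketch below stamps every log line with the current correlation id and then filters the output for one action. It uses only Python's standard `logging` and `contextvars` modules; the logger name, the log format and the `lines_for` helper are assumptions made for this example.

```python
import contextvars
import io
import logging

# Holds the correlation id of the action currently being handled.
correlation_id_var = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation id to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id_var.get()
        return True


def build_logger(stream: io.StringIO) -> logging.Logger:
    """Logger whose lines start with the correlation id (illustrative format)."""
    logger = logging.getLogger("correlation-sketch")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(correlation_id)s %(levelname)s %(message)s")
    )
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    return logger


def lines_for(log_text: str, correlation_id: str) -> list[str]:
    """All log lines that belong to one action, in the order they were written."""
    prefix = correlation_id + " "
    return [line for line in log_text.splitlines() if line.startswith(prefix)]
```

A handler would set `correlation_id_var` from the incoming event before doing any work, so that every line it logs can later be pulled out together with the lines that led up to an error.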
@rvplauborg and @MadsDue, I'm not sure how the architects want to carve out these features, but I've just created another one for tracing and diagnostics logging: https://app.zenhub.com/workspaces/epic-board-6375df2fd6f08e0015e1e0e6/issues/gh/energinet-datahub/green-energy-hub/489
@krmoos @rvplauborg I have no idea what the status of this one is. Can you help?
Hi @mogensjuul. I'm not really involved in this task, apart from the one comment I wrote about observability, so I'm afraid I can't give you an answer.
Synopsis
As any DH3 stakeholder
I want a production-grade implementation of the event-driven design
So that business processes don't get stuck or go haywire
And the system is monitored so that problems are detected early
And developers can quickly identify problems and make the system recover fast
Notes:
Acceptance Criteria
Tech. Notes
See the product teams' initiative in Confluence.
Testability
How to test
Environment:
User:
Scenario: