Particular / ServicePulse

Production monitoring for distributed systems.
https://docs.particular.net/servicepulse/

Implement Heartbeat Events viewing #12

Closed dannycohen closed 10 years ago

dannycohen commented 11 years ago

As Opie, when the heartbeat indicator is red, I want to see the event details so that I can take corrective action, and acknowledge the events once I have done so.

Visualization:

  1. When clicking on the endpoint heartbeat indicator (#41) the list of active heartbeat events associated with that endpoint is displayed
  2. For each event, display:
    • Event type ("Heartbeat")
    • Creation timestamp
    • Endpoint instance id
    • Time since last heartbeat message was received
  3. A single heartbeat event is active per heartbeat outage
    • (i.e. assuming a heartbeat is expected every 30 seconds and its arrival is checked every 1 minute: if no heartbeat is received for 10 minutes, there will still be only 1 heartbeat event for that endpoint, not 10 or 20 events)
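
A minimal sketch of the "one event per outage" rule above, in Python. Everything here is illustrative: `HeartbeatMonitor`, `HeartbeatEvent`, `grace_seconds` and the instance id are hypothetical names, not ServicePulse code, and the 60-second threshold is an assumption based on the 1-minute check in the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatEvent:
    event_type: str             # "Heartbeat" for every event in this sketch
    created_at: float           # creation timestamp, in seconds
    endpoint_instance_id: str
    last_heartbeat_at: float    # when the last heartbeat message arrived

    def seconds_since_last_heartbeat(self, now: float) -> float:
        return now - self.last_heartbeat_at


@dataclass
class HeartbeatMonitor:
    grace_seconds: float = 60.0  # assumed threshold, not a ServicePulse setting
    last_seen: Dict[str, float] = field(default_factory=dict)
    open_outages: Dict[str, HeartbeatEvent] = field(default_factory=dict)
    events: List[HeartbeatEvent] = field(default_factory=list)

    def record_heartbeat(self, instance_id: str, now: float) -> None:
        self.last_seen[instance_id] = now
        # The outage (if any) is over; its event stays in `events`
        # so it can still be listed and acknowledged later.
        self.open_outages.pop(instance_id, None)

    def check(self, now: float) -> None:
        for instance_id, seen in self.last_seen.items():
            if now - seen <= self.grace_seconds:
                continue  # heartbeat is recent enough
            if instance_id in self.open_outages:
                continue  # this outage already has its single event
            event = HeartbeatEvent("Heartbeat", now, instance_id, seen)
            self.open_outages[instance_id] = event
            self.events.append(event)
```

Running `check` once a minute for 10 minutes after a single heartbeat leaves exactly one entry in `events`; a second event is only created after a heartbeat arrives and a new outage begins.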

Notes:

Demo / Acceptance tests:

Case 1:

  1. Deploy the heartbeat plugin in 3 of the 5 Video Store sample endpoints (in all except the ContentManagement and Operations endpoints)
  2. Run the Video Store sample
  3. Kill the "Sales" endpoint
  4. Indicator should turn red within 1 minute
  5. Number below Heartbeat Indicator should be 2 in green and 1 in red
  6. Click on heartbeat indicator
  7. The events list is displayed with 1 event added for the missing "Sales" heartbeat

Case 2:

  1. Deploy the heartbeat plugin in 3 of the 5 Video Store sample endpoints (in all except the ContentManagement and Operations endpoints)
  2. Run the Video Store sample
  3. Kill the "Sales" endpoint
  4. Indicator should turn red within 1 minute
  5. Number below Heartbeat Indicator should be 2 in green and 1 in red
  6. Click on heartbeat indicator
  7. The events list is displayed with 1 event added for the missing "Sales" heartbeat
  8. Create a new instance of the "Sales" endpoint
  9. In the dashboard, "Sales" endpoint Indicator should turn green within 1 minute (since it starts receiving heartbeat messages from the endpoint)
  10. Click on heartbeat indicator
  11. The events list is displayed with 1 event added informing that the "Sales" endpoint resumed sending heartbeat messages
  12. In the dashboard, next to the "Sales" endpoint indicator (which is now green) there is a small icon (e.g. exclamation mark) indicating there are historical active events (i.e. "unacknowledged events")
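
The dashboard states expected in the two cases can be sketched as a small Python model. This is only an illustration of the acceptance criteria: `EndpointStatus`, `indicator_counts` and the endpoint names other than "Sales" are hypothetical placeholders, not ServicePulse code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EndpointStatus:
    name: str
    sending_heartbeats: bool
    unacknowledged_events: int = 0

    @property
    def color(self) -> str:
        # Red while heartbeats are missing, green otherwise.
        return "green" if self.sending_heartbeats else "red"

    @property
    def show_warning_icon(self) -> bool:
        # Green endpoint that still has events nobody acknowledged.
        return self.sending_heartbeats and self.unacknowledged_events > 0


def indicator_counts(endpoints: List[EndpointStatus]) -> Tuple[int, int]:
    """Return (green, red) counts shown below the heartbeat indicator."""
    green = sum(1 for e in endpoints if e.sending_heartbeats)
    return green, len(endpoints) - green


# Three monitored endpoints; "EndpointA"/"EndpointB" are placeholder names.
sales = EndpointStatus("Sales", sending_heartbeats=True)
endpoints = [sales, EndpointStatus("EndpointA", True), EndpointStatus("EndpointB", True)]

# Steps 3-5: "Sales" is killed and one heartbeat event is raised.
sales.sending_heartbeats = False
sales.unacknowledged_events += 1
after_kill = indicator_counts(endpoints)

# Steps 8-12: a new "Sales" instance starts and a "resumed" event is added.
sales.sending_heartbeats = True
sales.unacknowledged_events += 1
```

At the end of this run, "Sales" is green with the warning icon shown, and it stays that way until its unacknowledged events are cleared.
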
dannycohen commented 11 years ago

Notes:

dannycohen commented 11 years ago

Wireframes (illustrating failed messages, but apply the same logic to heartbeats)

  1. https://particular.mybalsamiq.com/projects/operations/20.%20Dashboard%20-%20Main%20View
  2. https://particular.mybalsamiq.com/projects/operations/22.%20Dashboard%20-%20Endpoint%20Indicator
  3. https://particular.mybalsamiq.com/projects/operations/24.%20Dashboard%20-%20Alert%20details
indualagarsamy commented 11 years ago

@dannycohen - Can you please update this? I am assuming that this is tied to the user story that talks about what happens when you click the heartbeat indicator?

dannycohen commented 11 years ago

@indualagarsamy - Updated. Please review and comment.

dannycohen commented 10 years ago

Should include this: https://github.com/Particular/ServicePulse/issues/13