Particular / ServicePulse

Production monitoring for distributed systems.
https://docs.particular.net/servicepulse/

Implement Process SLA indicator and events #21

Closed dannycohen closed 10 years ago

dannycohen commented 11 years ago

As Opie, I would like to see an indication when a process did not complete within the specified timespan (or completed but exceeded the specified timespan).

Notes:

Acceptance tests:

Prep for acceptance tests:

  1. Based on the Video Store sample, define the following processes using the UI provided by ServiceInsight (SI) (see https://github.com/Particular/ServiceInsight/issues/33):
    1. Process 1:
      • Process name: "Customer Interaction"
      • Starting event: "SubmitOrder"
      • Completion event: "DownloadIsReady"
      • Must complete within 40 seconds
    2. Process 2:
      • Process name: "CRM Data Processing"
      • Starting event: "SubmitOrder"
      • Completion event: "OrderAccepted" in the "CustomerRelations" endpoint
      • Must complete within 60 seconds
    3. Process 3:
      • Process name: "Content Provisioning"
      • Starting event: "OrderAccepted" in the "ContentManagement" endpoint
      • Completion event: "DownloadIsReady"
      • Must complete within 10 seconds
    4. Process 4:
      • Process name: "Order Cancellation"
      • Starting event: "CancelOrder"
      • Completion event: "OrderCancelled"
      • "Must complete..." is not checked (no completion SLA)
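The four definitions above can be written down as data, which makes the optional parts explicit (endpoint filters and the unchecked SLA). This is only a sketch: the `ProcessDefinition` class, its field names, and the `started_by` helper are hypothetical and are not part of ServicePulse or ServiceInsight.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProcessDefinition:
    """Hypothetical shape of a process definition; field names are illustrative."""
    name: str
    start_event: str
    completion_event: str
    start_endpoint: Optional[str] = None       # None = event from any endpoint
    completion_endpoint: Optional[str] = None  # None = event from any endpoint
    sla_seconds: Optional[int] = None          # None = "Must complete..." not checked


PROCESSES = [
    ProcessDefinition("Customer Interaction", "SubmitOrder", "DownloadIsReady",
                      sla_seconds=40),
    ProcessDefinition("CRM Data Processing", "SubmitOrder", "OrderAccepted",
                      completion_endpoint="CustomerRelations", sla_seconds=60),
    ProcessDefinition("Content Provisioning", "OrderAccepted", "DownloadIsReady",
                      start_endpoint="ContentManagement", sla_seconds=10),
    ProcessDefinition("Order Cancellation", "CancelOrder", "OrderCancelled"),
]


def started_by(event: str, endpoint: str) -> List[str]:
    """Names of the processes that a given event at a given endpoint would start."""
    return [p.name for p in PROCESSES
            if p.start_event == event and p.start_endpoint in (None, endpoint)]
```

With these definitions a single SubmitOrder starts both "Customer Interaction" and "CRM Data Processing", while OrderAccepted starts "Content Provisioning" only when it is seen in the "ContentManagement" endpoint.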

Case 1: Successful completion

  1. Open Video Store sample in VS and submit an order
  2. Wait until the conversation completes, within 40 seconds
  3. Make sure no event is raised in ServicePulse

Case 2: Delayed completion

  1. Open Video Store sample in VS
  2. Change the BuyersRemorseIsOver timeout from 20 seconds to 50 seconds
  3. Submit an order
  4. Wait until the conversation completes (in a little bit more than 40 seconds)
  5. As soon as the "Customer Interaction" process completes (processing the DownloadIsReady message):
    1. The Process SLA indicator is red and its number is incremented
    2. an event is raised (severity: error) indicating that: "'Customer Interaction' process instance [ProcessInstanceId] did not complete within specified time of 40 seconds (completed in 52 seconds)"
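The check performed when a process completes can be sketched as follows. The function name and parameters are hypothetical; the message text follows the wording used in this case. Treating a completion at exactly the SLA limit as "within" the SLA is an assumption, since the issue does not specify the boundary.

```python
from typing import Optional


def check_completed_process(process_name: str, instance_id: str,
                            sla_seconds: Optional[int],
                            elapsed_seconds: int) -> Optional[str]:
    """Return the error event text for a late completion, or None if no event is due.

    Assumption: elapsed == sla_seconds still counts as completing in time.
    """
    if sla_seconds is None:             # "Must complete..." is not checked
        return None
    if elapsed_seconds <= sla_seconds:  # completed within the SLA
        return None
    return (f"'{process_name}' process instance {instance_id} did not complete "
            f"within specified time of {sla_seconds} seconds "
            f"(completed in {elapsed_seconds} seconds)")
```

For the scenario in this case, `check_completed_process("Customer Interaction", "[ProcessInstanceId]", 40, 52)` yields the message quoted above, and an elapsed time of 38 seconds yields no event (Case 1).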

Case 3: SLA violation without completion

  1. Open Video Store sample in VS
  2. Change the BuyersRemorseIsOver timeout from 20 seconds to 600 seconds
  3. Submit an order
  4. Wait for up to 300 seconds
  5. Within 300 seconds:
    1. The Process SLA indicator is red and its number is incremented
    2. The "Customer Interaction" process did not complete (the DownloadIsReady message has not been processed), and an event is raised (severity: error) indicating that: "'Customer Interaction' process instance [ProcessInstanceId] did not complete within specified time of 40 seconds"
    3. The "CRM Data Processing" process did not complete, and another event is raised (severity: error) indicating that: "'CRM Data Processing' process instance [ProcessInstanceId] did not complete within specified time of 60 seconds"
  6. After 600 seconds, the process completes, and the following events are raised:
    • "'Customer Interaction' process instance [ProcessInstanceId] did not complete within specified time of 40 seconds (completed in 605 seconds)"
    • "'CRM Data Processing' process instance [ProcessInstanceId] did not complete within specified time of 60 seconds (completed in 603 seconds)"
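Detecting a violation before the process completes needs a periodic check rather than a check on completion only. One possible shape is sketched below. The `SlaTracker` class and its methods are hypothetical, times are plain seconds supplied by the caller, and counting one indicator increment per instance (not one per event) is an assumption that the issue does not confirm.

```python
class SlaTracker:
    """Tracks instances of one process definition (hypothetical sketch)."""

    def __init__(self, name, sla_seconds):
        self.name = name
        self.sla_seconds = sla_seconds
        self.started = {}      # instance id -> start time in seconds
        self.reported = set()  # instances already reported as overdue
        self.violations = 0    # the number shown on the indicator (assumed semantics)

    def _text(self, instance_id):
        return (f"'{self.name}' process instance {instance_id} did not complete "
                f"within specified time of {self.sla_seconds} seconds")

    def start(self, instance_id, now):
        self.started[instance_id] = now

    def sweep(self, now):
        """Report each still-running instance once, after its deadline has passed."""
        events = []
        for instance_id, started_at in self.started.items():
            overdue = now - started_at > self.sla_seconds
            if overdue and instance_id not in self.reported:
                self.reported.add(instance_id)
                self.violations += 1
                events.append(self._text(instance_id))
        return events

    def complete(self, instance_id, now):
        """On a late completion, raise the event that includes the elapsed time."""
        elapsed = now - self.started.pop(instance_id)
        if elapsed <= self.sla_seconds:
            return []
        if instance_id not in self.reported:
            self.violations += 1  # late, but no sweep ran before it completed
        self.reported.discard(instance_id)
        return [f"{self._text(instance_id)} (completed in {elapsed} seconds)"]
```

In this sketch a "Customer Interaction" instance started at second 0 produces the first event on the first sweep after second 40, and the second event ("completed in 605 seconds") when `complete` is called at second 605.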

Case 4: SLA violation without completion (2nd example)

  1. Open Video Store sample in VS
  2. Add a delay of 11 seconds in the processing of ProvisionDownloadResponse
  3. Submit an order
  4. Wait until the conversation completes
  5. As soon as the conversation completes:
    1. The Process SLA indicator is red and its number is incremented
    2. an event is raised (severity: error) indicating that: "'Content Provisioning' process instance [ProcessInstanceId] did not complete within specified time of 10 seconds"
  6. No other event is raised (since "CRM Data Processing" was not affected and completed within 60 seconds, and "Customer Interaction" process also completed within 40 seconds)
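The arithmetic behind this case can be checked with a small timeline. The timestamps below are assumed for illustration (OrderAccepted 20 seconds after SubmitOrder because of the BuyersRemorseIsOver timeout, DownloadIsReady 11 seconds later because of the added delay), and the "Sales" and "ECommerce" endpoint names are assumptions as well.

```python
# (seconds since SubmitOrder, message type, endpoint); all timings are assumed
TIMELINE = [
    (0, "SubmitOrder", "Sales"),
    (20, "OrderAccepted", "CustomerRelations"),
    (20, "OrderAccepted", "ContentManagement"),
    (31, "DownloadIsReady", "ECommerce"),
]

# (name, (start event, endpoint or None), (completion event, endpoint or None), SLA)
PROCESSES = [
    ("Customer Interaction", ("SubmitOrder", None), ("DownloadIsReady", None), 40),
    ("CRM Data Processing", ("SubmitOrder", None),
     ("OrderAccepted", "CustomerRelations"), 60),
    ("Content Provisioning", ("OrderAccepted", "ContentManagement"),
     ("DownloadIsReady", None), 10),
]


def first_time(timeline, event, endpoint):
    """Time of the first matching message; endpoint None matches any endpoint."""
    for at, message_type, where in timeline:
        if message_type == event and endpoint in (None, where):
            return at
    raise LookupError(event)


def violated(timeline, processes):
    """Names of the processes whose elapsed time exceeds their SLA."""
    names = []
    for name, start, completion, sla_seconds in processes:
        elapsed = first_time(timeline, *completion) - first_time(timeline, *start)
        if elapsed > sla_seconds:
            names.append(name)
    return names
```

Under these assumed timings "Content Provisioning" takes 11 seconds against a 10 second SLA, while "Customer Interaction" (31 seconds of 40) and "CRM Data Processing" (20 seconds of 60) stay within theirs, so `violated(TIMELINE, PROCESSES)` contains only "Content Provisioning".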

Case 5: Process Cancellation

  1. Open Video Store sample in VS
  2. Change the BuyersRemorseIsOver timeout from 20 seconds to 600 seconds
  3. Add a delay of 60 seconds in the processing of OrderCancelled
  4. Submit an order
  5. Wait for up to 300 seconds
  6. Within 300 seconds:
    1. The Process SLA indicator is red and its number is incremented
    2. The "Customer Interaction" process did not complete (the DownloadIsReady message has not been processed), and an event is raised (severity: error) indicating that: "'Customer Interaction' process instance [ProcessInstanceId] did not complete within specified time of 40 seconds"
    3. The "CRM Data Processing" process did not complete, and another event is raised (severity: error) indicating that: "'CRM Data Processing' process instance [ProcessInstanceId] did not complete within specified time of 60 seconds"
  7. After 400 seconds, click on the Cancel button to send a CancelOrder message
  8. The process is cancelled within ~460 seconds (400 waited, and 60 delayed in OrderCancelled processing)
  9. No additional events are raised

dannycohen commented 11 years ago

Note that the above acceptance tests do include the indicator being red and the number incrementing as processes violate their SLAs. They do not define how the indicator turns green again (more precisely: how Opie turns the indicator green). This is defined in the issue for acknowledging the events: https://github.com/Particular/ServicePulse/issues/13

dannycohen commented 10 years ago

Closing for re-eval in requirement repo.