Open ps48 opened 2 years ago
Synthetics Demo Video from @paulstn
https://user-images.githubusercontent.com/4348487/166390233-11dbe004-3408-4174-905b-e0fef43fb035.mov
TODO: add work breakdown/ pending issues
hey is this still tracking for 2.3?
Any plan in which version this functionality will be included?
@rafael-gumiero we don't have this feature in our priority right now. We are open to guiding community contributions. cc: @anirudha @paulstn
Synthetics Design Document
Code: https://github.com/opensearch-project/observability/tree/uptime
1. Overview
Synthetics is a new module in observability that lets users monitor the availability and response times of applications and services in real time. It breaks availability and response time down by component for each service and application, so users can detect problems before they affect their end customers.
2. Motivation
Synthetics systems are useful for measuring the stability and reliability of live systems and for analyzing their health. Continuous monitoring of micro-service based software systems is an essential component of observability. Synthetics opens the door to infrastructure visibility by proactively pinging API endpoints. It can be considered an auto-pilot extension to human observation, especially when combined with other OpenSearch capabilities, namely reporting, anomaly detection and alerting.
Synthetics can be utilized for the following use-cases:
3. How is it different from other plugins?
4. Requirements
4.1 Functional Requirement
4.1 Dashboards Observability
4.1.1 Synthetics Home
4.1.2 Test-Suite View
4.1.3 Add & Configure Test-Suite
Note: Usernames and passwords are stored so that endpoints protected by HTTP basic authentication can be accessed.
More detailed HTTP request:
The period specifies the length of each interval (it has to be a number) and the unit specifies the unit of that time (it has to be one of "weeks", "days", "hours", "minutes", and "seconds"). The job triggers once and then again after each interval elapses. Documentation
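As a minimal sketch of the interval rule described here, the following Python turns a period and unit into trigger times. The function names `interval_to_timedelta` and `next_runs` are hypothetical and not part of the plugin, and the rule that the period must be positive is an assumption; the text only says it has to be a number.

```python
from datetime import datetime, timedelta

# Units named in the text; each is also a keyword accepted by datetime.timedelta.
VALID_UNITS = ("weeks", "days", "hours", "minutes", "seconds")


def interval_to_timedelta(period, unit):
    """Convert a period/unit pair into a timedelta, rejecting invalid input."""
    if isinstance(period, bool) or not isinstance(period, (int, float)):
        raise ValueError("period has to be a number")
    if period <= 0:
        # Assumption: a zero or negative interval is treated as invalid.
        raise ValueError("period has to be positive")
    if unit not in VALID_UNITS:
        raise ValueError(f"unit has to be one of {VALID_UNITS}")
    return timedelta(**{unit: period})


def next_runs(start, period, unit, count):
    """Return the first `count` trigger times: once at `start`, then after each interval."""
    step = interval_to_timedelta(period, unit)
    return [start + i * step for i in range(count)]
```

For example, `next_runs(datetime(2022, 5, 1, 12, 0), 15, "minutes", 3)` yields 12:00, 12:15 and 12:30 on that day.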
This is similar to 'cron' based scheduling, where each component gives the time at which the job triggers. Each component takes either a valid string or '*' as a wildcard. So in the above example, the job will trigger every minute at the 0th, 15th, 30th, and 45th second. Note that each component's default value is the wildcard, so most components can be left out. Looking up 'cron' will give more of the necessary information on how to use this scheduler. Documentation
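The wildcard and default behaviour described here can be sketched as a small matcher. This is an illustration only: the component names in `FIELDS`, the comma-separated list syntax, and the functions `field_matches` and `cron_matches` are assumptions for this sketch, not the scheduler's actual interface, and ranges or step expressions are not handled.

```python
from datetime import datetime

# Assumed component names; the text only says every component defaults to '*'.
FIELDS = ("month", "day", "hour", "minute", "second")


def field_matches(expr, value):
    """True if `value` satisfies a component: '*' or a comma-separated list of integers."""
    if expr == "*":
        return True
    return value in {int(part) for part in expr.split(",")}


def cron_matches(schedule, moment):
    """True if `moment` satisfies every component; omitted components act as '*'."""
    return all(
        field_matches(str(schedule.get(name, "*")), getattr(moment, name))
        for name in FIELDS
    )
```

With `schedule = {"second": "0,15,30,45"}`, every other component falls back to the wildcard, so the schedule matches at seconds 0, 15, 30 and 45 of every minute.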
4.1.4 Certificates
4.1.5 Settings
4.2 OpenSearch Observability
4.2.1 Observability Scheduler
4.2.2 Endpoint Client
4.2.3 Indexing Client
.opensearch-observability index.
.opensearch-observability index.
observability-synthetics-logs index
4.3 Optional requirements:
4.3.1 Plugin Integrations
observability-synthetics-logs index. This index will store the logs generated by synthetics test-suites.
4.3.2 Reporting
4.3.3 Anomaly Detection
4.3.4 Alerting
6. Architecture
6.1 Architecture 1
6.1.1 OpenSearch-Observability
Index client: As of OSS 1.2.0, the observability backend plugin is already part of the OpenSearch plugins. This backend has an indexing client to add/get/delete/update documents in the observability index. We can extend this backend plugin to perform similar operations on new data model types: syntheticsTestSuite, syntheticsLogs and syntheticsSettings
Endpoint client: This will be a new component added to the observability backend. This component will be responsible for performing all the API requests and storing the results in the index using the index client APIs. The component will also validate the response using PPL processors.
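A rough sketch of what a single endpoint check could produce, written in Python for illustration rather than in the backend's own language. Everything named here is hypothetical: `run_check`, the record field names, and the rule that a 2xx or 3xx status counts as "up" are assumptions, since the design leaves validation to PPL processors. The `opener` parameter exists only so the request function can be swapped out.

```python
import time
import urllib.error
import urllib.request


def run_check(url, opener=urllib.request.urlopen, timeout=10.0):
    """Request `url` once and return a log record describing the outcome."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # no response at all (DNS failure, refusal, timeout)
    duration_ms = (time.monotonic() - started) * 1000.0
    return {
        "url": url,
        "status": status,
        # Assumption: 2xx/3xx means "up"; the design validates with PPL processors.
        "up": status is not None and 200 <= status < 400,
        "response_duration_ms": duration_ms,
    }
```

A record of this shape is what the index client would then store as one synthetics log entry.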
observability-synthetics-logs. The user can run PPL queries on the request or response parameters to check for values and validate whether the pinged app or service is up. These PPL queries are called PPL processors. They run on each response from the endpoint. The other values added by the PPL processor are as follows:
Job Scheduler: The observability plugin will have its own abstract component built on the OpenSearch job scheduler SPI. This will let other observability components use its extensible interfaces to register a job and run it with appropriate callbacks on completion. Later, synthetics test-suites will use these interfaces to run the pings to their respective endpoints. Apart from the scheduled synthetics requests, an auto-deletion job should be scheduled as a daily cron job to delete logs older than a given age threshold. The log age threshold may be changed on the settings page of synthetics.
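The auto-deletion rule can be sketched as a simple age cutoff. This is an in-memory illustration under stated assumptions: the function `split_by_age`, the `timestamp` field name, and a threshold expressed in days are all hypothetical. A real job would more likely express the same cutoff as a range query against the logs index.

```python
from datetime import datetime, timedelta, timezone


def split_by_age(logs, max_age_days, now):
    """Partition log records into (kept, expired) by their `timestamp` field."""
    cutoff = now - timedelta(days=max_age_days)
    kept = [log for log in logs if log["timestamp"] >= cutoff]
    expired = [log for log in logs if log["timestamp"] < cutoff]
    return kept, expired
```

The daily job would compute `expired` with the threshold taken from the synthetics settings and delete those records.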
6.1.2 Dashboards-Observability
observability-synthetics-logs index.
syntheticsRequests and get a field e.g. response duration or even a collection of fields. Finally, users can configure their ML model over extracted fields.
Pros
Cons
6.2 Architecture 2
6.2.1 OpenSearch-Observability
6.2.2 Dashboards-Observability
Pros
Cons
6.3 Architecture 3 [Preferred]
6.3.1 OpenSearch-Observability
.observability-index
.observability-synthetics-logs index.
6.3.2 Dashboards-Observability
observability-synthetics-logs index and populates the pages.
Pros
Cons
7. Miscellaneous
7.1 FGAC for OpenSearch
observability-synthetics-logs must adhere to all access control permissions.
7.2 Options for having location in Synthetics:
Some potential solutions:
8. Data Model
8.1 syntheticsTestSuite (Architecture-1 [6.1])
.opensearch-observability index.
8.2 syntheticsTestSuite (Architecture-2 6.3)
8.3 syntheticsLogs
observability-synthetics-logs index.
[Optional] 8.4 syntheticsSettings
.opensearch-observability index.
9. OpenSearch/Plugins REST endpoints
9.1 Using pre-existing REST endpoints
9.1.1 Stats APIs
9.1.2 Other miscellaneous APIs
9.2 Health Check endpoints
10. Appendix
10.1 Alerting & Anomaly Detection
10.2 Reporting
10.3 Synthetic monitoring
11. References
11.1 https://geekflare.com/monitor-website-uptime/
11.2 https://www.datadoghq.com/uptime-monitoring-tools/
11.3 https://cabotapp.com/
11.4 https://github.com/arachnys/cabot#single-service-overview
11.5 https://alyvix.com/learn/introduction.html