Closed zeeshanakram3 closed 1 year ago
I did a POC integration of YT-synch with the Elasticsearch logging stack with the following setup:
After the POC, all the logging data was indexed in Elasticsearch and was available in the Kibana dashboard for executing queries or creating visualizations. For the Youtube-synch service, I divided the logging data that would be sent to ES into two groups: monitoring data (for performance tracking) and alerting data (in case of service errors/exceptions).
The Youtube-synch application is broken down into 3 services, i.e. ContentDownloadService, ContentCreationService, and ContentUploadService, that run independently and process the content that concerns them. So based on the application design, each service logs the content that it is processing.
Each log entry contains the videoId (ID of the YouTube video), the channelId (the Joystream/YT channel ID the video belongs to), the action that the service was executing, and the action timestamp.
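As a rough illustration of the entry described above, here is a minimal Python sketch. The field names `service` and `action`, the action value, and the helper `build_log_entry` are assumptions made for the example; only `videoId`, `channelId`, the action and the timestamp come from the description, and the real service may shape its entries differently.

```python
from datetime import datetime, timezone


def build_log_entry(service, action, video_id, channel_id, now=None):
    """Build one structured log entry (field layout is an assumption)."""
    now = now or datetime.now(timezone.utc)
    return {
        "service": service,        # which of the 3 services emitted the entry
        "action": action,          # what the service was doing
        "videoId": video_id,       # ID of the YouTube video
        "channelId": channel_id,   # Joystream/YT channel the video belongs to
        "timestamp": now.isoformat(),
    }


entry = build_log_entry(
    service="ContentDownloadService",
    action="download_started",  # hypothetical action name
    video_id="yt-video-123",
    channel_id="channel-42",
    now=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
```

Keeping every entry flat, with the same keys across all three services, is what would make it straightforward to group and filter them later in Elasticsearch.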
The following events will be logged & sent to Elasticsearch:
ContentDownloadService
ContentCreationService
ContentUploadService
ContentUploadService
The raw events, grouped by service, could identify the performance bottlenecks of each service and hence could help in selecting which enhancements to make. Elasticsearch provides a domain-specific query language (Query DSL) for querying the indexed log data, so I think these raw event data points are generic enough to be used for complex queries or dashboards.
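To give a feel for the kind of query this enables, here is a hedged Python sketch that builds a Query DSL body counting one service's events per action over a recent time window. The field names `service`, `action` and `timestamp` are assumptions, and the sketch further assumes `service` and `action` are mapped as keyword fields so that `term` filters and `terms` aggregations work on them.

```python
def events_per_action_query(service, since="now-1h"):
    """Return a search body that counts a service's events per action."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service": service}},
                    {"range": {"timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "by_action": {"terms": {"field": "action"}},
        },
    }


query = events_per_action_query("ContentUploadService")
```

The resulting dictionary could be sent as the JSON body of a `_search` request against whichever index the logs are written to.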
Let me know if these logging events are sufficient or if we want to add more fields to the events
The following error events will be logged & sent to Elasticsearch to create the alerts based on the logs
- QueryNodeApi exceptions/errors (including network, syntax, or data errors).
- RuntimeApi error in case of a disconnect from the Runtime API.
- RuntimeApi error in case the video creation extrinsic fails.
- StorageNodeApi error in case of a data-object upload failure to the storage node.
- ContentDownloadError error in case of a failure while downloading video media from YouTube.

These events will also include the ID of the YouTube video (videoId) that was being processed when the error occurred, along with the timestamp of the log entry.
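A minimal Python sketch of how such an error event could be shaped follows. The `ErrorSource` enum, the `level`, `source` and `message` field names, and the helper `build_error_event` are all hypothetical; only the error source names, `videoId` and the timestamp come from the list above.

```python
from datetime import datetime, timezone
from enum import Enum


class ErrorSource(Enum):
    """Error sources named in the list above (the enum itself is hypothetical)."""
    QUERY_NODE_API = "QueryNodeApi"
    RUNTIME_API = "RuntimeApi"
    STORAGE_NODE_API = "StorageNodeApi"
    CONTENT_DOWNLOAD = "ContentDownloadError"


def build_error_event(source, video_id, error, now=None):
    """Build one error event carrying the video being processed."""
    now = now or datetime.now(timezone.utc)
    return {
        "level": "error",
        "source": source.value,
        "videoId": video_id,
        "message": str(error),
        "timestamp": now.isoformat(),
    }


event = build_error_event(
    ErrorSource.STORAGE_NODE_API,
    "yt-video-123",
    ConnectionError("upload failed"),
)
```

A fixed `source` value per error type would let an alert rule filter on one failure class without parsing free-text messages.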
Fantastic work! I think this is more than enough to start to use, and then we let real world problems guide any further enhancements if needed. Some questions
Watcher, Elasticsearch Alerting, Grafana and ElastAlert may be possible alternatives. Many of these seem to know how to deliver messages all the way to the final destination, like Slack, Telegram, email, etc.
- How do you propose we distribute all of this tooling to yt-synch operators in a way where it is very easy to get this setup running?
First, if we assume that the Elasticsearch infra is already set up, then yt-synch operators only need to provide the endpoint and the credentials for the ES instance that the yt-synch service would be sending data to for indexing. There are two ways yt-synch operators can set up the Elasticsearch infra: they can either use the fully managed Elastic Cloud (which is easy to configure for non-technical operators), or they can opt for on-premises, self-managed Elasticsearch & Kibana instances. For the latter, we can prepare a docker/docker-compose setup so that deploying and configuring the self-hosted Elastic stack takes minimal work.
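As a sketch of the "endpoint and credentials only" idea, the Python snippet below reads the operator-supplied settings from environment variables and fails early if any are missing. The variable names `ES_ENDPOINT`, `ES_USERNAME` and `ES_PASSWORD` and the helper `load_es_config` are hypothetical; the actual service may use a config file or different names.

```python
import os


REQUIRED_KEYS = ("ES_ENDPOINT", "ES_USERNAME", "ES_PASSWORD")  # hypothetical names


def load_es_config(env=None):
    """Read Elasticsearch connection settings, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("missing Elasticsearch settings: " + ", ".join(missing))
    return {
        "endpoint": env["ES_ENDPOINT"].rstrip("/"),  # normalise trailing slash
        "username": env["ES_USERNAME"],
        "password": env["ES_PASSWORD"],
    }


# Example with an explicit mapping instead of the real environment.
config = load_es_config({
    "ES_ENDPOINT": "https://es.example.com:9200/",
    "ES_USERNAME": "yt-synch",
    "ES_PASSWORD": "change-me",
})
```

The same three values would work whether the operator points at Elastic Cloud or at a self-hosted instance, which keeps the operator-facing configuration identical for both options.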
- What remains to go from POC to us using this in production?
Right now I am using local instances of Elasticsearch & Kibana (running on the same instance as the YT-synch service). However, it is advisable to run them on a different instance so that the indexed logs stay safe if the YT-synch host machine crashes. If we want to go with the self-managed option, then we can set up Elasticsearch & Kibana on a dedicated server. Otherwise, as mentioned earlier, Elastic provides a fully managed option in the form of Elastic Cloud (with Kibana already set up).
Obviously, for a production setup, I need to set up basic alerting in case of exceptions.
- I believe there are plugins or tools that very easily allow for triggering external message pushing based on certain conditions being satisfied in the elastic search database
Yes, there are rule-based alerting options available that you can set up to push messages to configured destinations when specific conditions are met. You can set up these alerts from the Kibana dashboard.
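To make "specific conditions" concrete, here is a small Python sketch of the logic a threshold-style rule typically evaluates: fire when at least N error events fall within a recent time window. This only illustrates the condition; it is not Kibana's rule API, and the function name, window and threshold are assumptions.

```python
from datetime import datetime, timedelta, timezone


def should_alert(error_timestamps, now, window=timedelta(minutes=5), threshold=3):
    """Return True when at least `threshold` errors fall inside the window."""
    recent = [t for t in error_timestamps if now - window <= t <= now]
    return len(recent) >= threshold


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
errors = [
    now - timedelta(minutes=1),
    now - timedelta(minutes=2),
    now - timedelta(minutes=4),
    now - timedelta(minutes=30),  # outside the 5-minute window
]
fire = should_alert(errors, now)
```

In practice the rule would be configured in Kibana against the indexed error events, with the delivery destination (Slack, email, etc.) attached as the rule's action.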
As stated