Feature | extend how-it-works/monitoring/logs section with AWS ES vs EC2 Hosted ES

exequielrafaela commented 3 years ago

What?

Extend how-it-works/monitoring/logs section with AWS ES vs EC2 Hosted ES

# Considerations
- Concepts
  - Cluster, node, index, shard, segment, document
  - Node roles: master, data, coordinating (client), ingest, ML
  - Data Operations: Index, Delete, Update, Search
- Cost
  - Service
    - AWS
      - Free tier
      - Charged by instance size (EC2), volume size (EBS) and data transfer (standard)
      - Reserved Instances are supported
      - Snapshots are free
    - Self-hosted
      - Free tier
      - Charged by instance size (EC2), volume size (EBS) and data transfer (standard)
      - Reserved Instances and Savings Plans are supported
      - Snapshots are charged by storage size
  - Operation
    - AWS
      - Scaling can be achieved simply through AWS API/CLI
      - Tuning is more limited due to constrained configuration settings
      - Failed nodes are replaced automatically
    - Self-hosted
      - Requires more expertise for monitoring/optimization/tuning
      - Requires more automation for scaling
- Configuration
  - Cache settings
  - Thread settings
  - Security
  - XPack: not available in AWS
- Operation
  - Initial Setup
  - Self-healing
  - Updating/Patching
  - Scaling
  - Backups
  - Index Management (curator, ILM)
- Security
  - AuthN
    - AWS: IAM
    - Self-hosted: XPack
  - AuthZ
    - AWS: IAM + FGAC
    - Self-hosted: XPack
  - Encryption
    - At-rest
    - In-transit
    - Node-to-node
- Monitoring
  - AWS
    - CloudWatch: metrics (limited), dashboards, alarms, logs (limited)
  - Self-hosted
    - Node-Exporter => Prometheus/Grafana
    - ElasticSearch-Exporter => Prometheus/Grafana
    - Kibana XPack Monitoring
    - SemaText ES monitoring (paid)
  - Both
    - Kibana XPack Monitoring: free plan is very basic
- Usage
  - Logs Ingestion
    - Quantity: number of log events per hour
    - Size: average number of bytes per hour
  - Storage
    - Use Quantity and Size to estimate the storage size that will be needed
    - With that we can also estimate backups storage size
  - Querying
    - How many simultaneous users will be accessing Kibana?
    - How many queries will each user run per hour?
- Capacity Planning
  - Storage (Volume)
    - Ingestion
      - Amount of data per unit of time
      - Eg: 500 KB/sec ≈ 1296 GB/mo (30 days, decimal units; about 1236 GiB)
    - Retention: how many days?
      - Eg: 30d
  - Compute (Throughput)
    - Search-bound vs Index-bound
      - For Centralized Logging the workload is index-bound: a sustained high indexing rate is the priority, and search performance is secondary
  - References
    - https://www.elastic.co/pdf/elasticsearch-sizing-and-capacity-planning.pdf
    - https://aws.amazon.com/blogs/big-data/best-practices-for-configuring-your-amazon-elasticsearch-service-domain/
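
The storage part of the capacity planning above can be sketched as a small calculation. This is a rough sketch, not a sizing tool: the function names, the replica count and the `overhead` parameter are illustrative assumptions that do not come from this issue. The default of 1.45 follows the rule of thumb commonly quoted in AWS sizing guidance (source data × (1 + replicas) × 1.45) and should be validated against real index sizes.

```python
SECONDS_PER_DAY = 86_400


def ingest_gb(rate_kb_per_sec: float, days: int = 30) -> float:
    """Raw data ingested over `days`, in decimal GB (1 GB = 1,000,000 KB)."""
    return rate_kb_per_sec * SECONDS_PER_DAY * days / 1_000_000


def required_storage_gb(
    rate_kb_per_sec: float,
    retention_days: int,
    replicas: int = 1,       # assumption: one replica per primary shard
    overhead: float = 1.45,  # assumption: indexing + OS + reserved-space overhead
) -> float:
    """Approximate cluster storage needed to keep `retention_days` of logs."""
    source_gb = ingest_gb(rate_kb_per_sec, retention_days)
    return source_gb * (1 + replicas) * overhead


if __name__ == "__main__":
    # Figures from the outline: 500 KB/sec ingestion, 30 days of retention.
    print(f"raw ingest:  {ingest_gb(500, 30):.1f} GB")
    print(f"min storage: {required_storage_gb(500, 30):.1f} GB")
```

With 500 KB/sec and 30 days of retention, the raw ingest is 1296 GB; with one replica and the assumed overhead factor, the estimate comes to roughly 3758 GB. Snapshot storage can be estimated from the raw ingest figure in the same way.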

Why?

To simplify Leverage adopters' decision-making when choosing between a managed (AWS ES) and a self-hosted (EC2) ElasticSearch implementation.

exequielrafaela commented 3 years ago

Reflect this enhanced analysis https://binbash.atlassian.net/wiki/spaces/BDPS/pages/1789165573/Centralised+Logging

exequielrafaela commented 3 years ago

https://leverage.binbash.com.ar/how-it-works/monitoring/logs/#alternatives-comparison-table

binbashar / le-ref-architecture-doc

Feature | extend how-it-works/monitoring/logs section with AWS ES vs EC2 Hosted ES #69
