Netflix-Skunkworks / service-capacity-modeling

Apache License 2.0

Unclear repetition #34

Open abersnaze opened 2 years ago

abersnaze commented 2 years ago

I'm working on summarizing the cost, CPU, and disk (local and attached) for both regional and zonal clusters. I want more consistency in the way repetition is represented.

us-east-1: # trimmed
us-west-2:
  least_regret:
    - candidate_clusters:
        total_annual_cost: # redacted
        zonal:
          - cluster_type: cassandra # trimmed
          - cluster_type: cassandra # trimmed
          - cluster_type: cassandra # trimmed
        regional:
          - cluster_type: dgwkv
            total_annual_cost: # redacted
            count: 3
            instance:
              total_annual_cost: # redacted
              name: r5.large
            attached_drives:
              - name: gp2
                size_gib: 20
                annual_cost_per_gib: # redacted
                annual_cost_per_read_io: # redacted
                annual_cost_per_write_io: # redacted

In the sample above there are:

jolynch commented 2 years ago

Yeah, the tricky part is that regions may have two or more zones, and some services deploy regionally (in which case it's usually a single cluster) or zonally (in which case it's O(n) clusters, where n = that region's number of zones). I chose the most compact representation I could figure out that accurately describes all possible deployments. Feel free to improve it if you can. Just remember that there are variable cluster sizes per zone (e.g. in leader-follower dbs), different cluster types, and possibly multiple clusters of the same cluster type (e.g. a metadata db and a data plane db of the same type).
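
For illustration only, here is a minimal sketch of how a consumer might walk both the `zonal` and `regional` lists uniformly and roll up instance counts and attached disk per cluster type. It works on plain dicts shaped like the YAML sample above, not on the library's actual model classes. The `summarize` helper is hypothetical, the cassandra `count` and `size_gib` values are made up (the sample trims them), and treating `size_gib` as per-instance is an assumption.

```python
from collections import defaultdict


def summarize(candidate_clusters):
    """Roll up clusters, instances, and attached GiB per (scope, cluster_type).

    Hypothetical helper: assumes each cluster dict carries `cluster_type`,
    `count`, and an optional `attached_drives` list, and that `size_gib`
    is per instance (an assumption, not confirmed by the thread).
    """
    totals = defaultdict(lambda: {"clusters": 0, "instances": 0, "attached_gib": 0})
    for scope in ("zonal", "regional"):
        for cluster in candidate_clusters.get(scope, []):
            count = cluster.get("count", 0)
            gib_per_instance = sum(
                drive.get("size_gib", 0)
                for drive in cluster.get("attached_drives", [])
            )
            entry = totals[(scope, cluster["cluster_type"])]
            entry["clusters"] += 1  # repetition shows up as repeated list items
            entry["instances"] += count
            entry["attached_gib"] += count * gib_per_instance
    return dict(totals)


# Regional values come from the sample; cassandra values are invented.
candidate_clusters = {
    "zonal": [
        {"cluster_type": "cassandra", "count": 2,
         "attached_drives": [{"name": "gp2", "size_gib": 100}]}
        for _ in range(3)
    ],
    "regional": [
        {"cluster_type": "dgwkv", "count": 3,
         "attached_drives": [{"name": "gp2", "size_gib": 20}]},
    ],
}

summary = summarize(candidate_clusters)
for key, value in sorted(summary.items()):
    print(key, value)
```

Keying on `(scope, cluster_type)` merges multiple clusters of the same type, so a consumer that needs to tell a metadata db from a data plane db of the same type would need an extra distinguishing field.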