jjneely / buckytools

Go implementation of useful tools for dealing with Graphite's Whisper DBs and Carbon hashing

Hashring alignment and cache reads #26

Open dolvany opened 7 years ago

dolvany commented 7 years ago

I would like to consult the trifecta of graphite wisdom @jjneely @deniszh @grobian regarding some general questions.

First, I would like to understand how to align the carbon-c-relay hashring with the buckytools hashring given the following configurations.

cluster cache1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
    srv2:2053
    srv2:2054
  ;
buckyd -hash jump_fnv1a srv1 srv2

Would this result in aligned hashrings even though carbon-c-relay is sending to multiple cache processes on each server?

Does it make sense to have cache instances dedicated to writing and cache instances dedicated to reading? Would this make reads and writes more performant?

deniszh commented 7 years ago

I don't think that it will work like this. Usually people use a 2-tier setup in that situation: one relay distributes metrics across the servers, and another relay on each host distributes metrics across the local carbon caches for load balancing. Or you can use go-carbon and get rid of the second tier - it's performant enough.

Does it make sense to have cache instances dedicated to writing and cache instances dedicated to reading? Would this make reads and writes more performant?

That makes no sense - if a carbon daemon gets no writes, it will have no cache to read from.

dolvany commented 7 years ago

Hmm, judging by the docs, it seems that instance names may solve this...

cluster cache1
  jump_fnv1a_ch
    srv1:2053=a
    srv1:2054=b
    srv2:2053=c
    srv2:2054=d
  ;
buckyd -hash jump_fnv1a srv1:a srv1:b srv2:c srv2:d

After taking this for a test drive, it seems that buckyd doesn't like it: Error parsing hashring: strconv.ParseInt: parsing "a": invalid syntax

jjneely commented 7 years ago

You need to give buckyd the exact same host/port/instance strings as you do for carbon-c-relay. So:

./buckyd -hash jump_fnv1a srv1:2053=a srv1:2054=b srv2:2053=c srv2:2054=d

dolvany commented 7 years ago

Ah, thx @jjneely. The readme seems a bit misleading on this syntax. Does it require an update?

SERVER
SERVER:INSTANCE
SERVER:PORT:INSTANCE

Also, when I use buckyd srv1 I get this output for bucky servers.

-bash-4.2$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: jump_fnv1a:    0:srv1
Number of replicas: 1
Found these servers:
    srv1

Is cluster healthy: true
-bash-4.2$

But when I use buckyd srv1:2053=a srv1:2054=b I get this output for bucky servers.

-bash-4.2$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: jump_fnv1a:    0:srv1      1:srv1
Number of replicas: 1
Found these servers:
    srv1
    srv1

Is cluster healthy: false
2017/09/15 18:44:33 Cluster is inconsistent.
-bash-4.2$

Not sure what is causing the cluster to go inconsistent or if it is something to be concerned about.

deniszh commented 7 years ago

You need to run a buckyd for every instance, which would mean e.g. two buckyd daemons on the same port on srv1. That's why I said:

I don't think that it will work like this.

dolvany commented 7 years ago

@deniszh One buckyd per carbon-cache instance, not per server? So, would the typical deployment only run one carbon-cache per server?

jjneely commented 7 years ago

I only run one carbon-cache / go-carbon daemon per server.

The way replication/load balancing works, I want to make sure I have 2 copies of the same metric on different servers, not assigned to 2 different daemons that happen to live on the same host. (I'll hopefully have some replication support in buckytools in the next month or so.)

In the far distant past I did run multiple carbon-cache daemons per server to handle my throughput, but the storage requirements grew so much that disk IO, not ingestion, became the limiting factor.
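
On the two-copies point above: on the carbon-c-relay side this is usually expressed with the replication keyword against server-level members, so the two copies of a metric land on distinct members, and therefore distinct hosts when members are listed per server rather than per instance. A minimal sketch, assuming carbon-c-relay's replication syntax; the cluster name and port 2003 are only illustrative:

cluster mirrored
  jump_fnv1a_ch replication 2
    srv1:2003
    srv2:2003
  ;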

dolvany commented 7 years ago

Thx, @jjneely. Let me provide some more transparency regarding my goal. Below is the carbon-c-relay config. I am not using a replication factor, just duplicating the metrics to two separate clusters. I would like to use bucky to manage each cluster independently. Is reducing the number of carbon-cache instances on each server to one the only reasonable way to integrate bucky?

cluster c1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
    srv2:2053
    srv2:2054
  ;
cluster c2
  jump_fnv1a_ch
    srv3:2053
    srv3:2054
    srv4:2053
    srv4:2054
  ;
match *
  send to
    c1
    c2
  stop
  ;
jjneely commented 7 years ago

At this point, yes, that's the easiest way to that goal.

Although, I guess the real bug here is making bucky aware of multiple instances on the same physical host.

dolvany commented 7 years ago

There are presumably two things going on here, @jjneely:

  1. Resolve a graphite key to a cluster member. It seems that this should work regardless of whether a cluster member appears twice (multiple carbon-cache instances).

  2. Enumerate the cluster members. This seems to be problematic if a cluster member appears more than once (multiple carbon-cache instances). Is it enough to just ensure that entries in the cluster member list are unique?

I suppose my thinking is more along the lines of ignoring the fact that multiple instances are on the same physical host, except for hashring purposes.
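
Point 2 above amounts to de-duplicating the server list the client derives from the hashring. A minimal stdlib-only sketch of that idea (the helper name dedupeServers is hypothetical); the patch in the next comment reaches the same result via an external unique.Strings helper:

package main

import "fmt"

// dedupeServers returns the server list with duplicates removed while
// preserving first-seen order.
func dedupeServers(servers []string) []string {
	seen := make(map[string]struct{}, len(servers))
	out := make([]string, 0, len(servers))
	for _, s := range servers {
		if _, ok := seen[s]; ok {
			continue
		}
		seen[s] = struct{}{}
		out = append(out, s)
	}
	return out
}

func main() {
	// Two carbon-cache instances per host collapse to one entry per host.
	fmt.Println(dedupeServers([]string{"srv1", "srv1", "srv2", "srv2"}))
	// Output: [srv1 srv2]
}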

dolvany commented 7 years ago

I made some tweaks to bucky client to support multi-instance. I removed the check to verify that the cluster members length equals the hashring length, since this would not be true if a cluster member has multiple hashring entries. I also removed duplicates from the servers slice. I have no idea if this is a breaking change for anything else, but bucky servers runs clean.

diff --git a/cmd/bucky/cluster.go b/cmd/bucky/cluster.go
index 4af0585..d22eabf 100644
--- a/cmd/bucky/cluster.go
+++ b/cmd/bucky/cluster.go
@@ -7,6 +7,7 @@ import (
 )

 import "github.com/jjneely/buckytools/hashing"
+import "github.com/krasoffski/gomill/unique"

 type ClusterConfig struct {
        // Port is the port remote buckyd daemons listen on
@@ -79,6 +80,7 @@ func GetClusterConfig(hostport string) (*ClusterConfig, error) {
                Cluster.Servers = append(Cluster.Servers, v.Server)
        }

+       Cluster.Servers = unique.Strings(Cluster.Servers)
        members := make([]*hashing.JSONRingType, 0)
        for _, srv := range Cluster.Servers {
                if srv == master.Name {
@@ -105,9 +107,9 @@ func isHealthy(master *hashing.JSONRingType, ring []*hashing.JSONRingType) bool
        // XXX: Take replicas into account
        // The initial buckyd daemon isn't in the ring, so we need to add 1
        // to the length.
-       if len(master.Nodes) != len(ring)+1 {
-               return false
-       }
+       // if len(master.Nodes) != len(ring)+1 {
+       //      return false
+       // }

        // We compare each ring to the first one
        for _, v := range ring {
grobian commented 7 years ago

I'd like to point out that, unlike carbon_ch, fnv1a_ch does include the port in its hash key. Since you use that hash, there should be no such thing as "duplicate" cluster members. I believe that's what @jjneely was getting at in https://github.com/jjneely/buckytools/issues/26#issuecomment-329864487.

dolvany commented 7 years ago

Let me see if my assumptions are correct, @grobian. Please let me know if any of this is amiss. The bucky client derives the list of cluster hosts from the destinations in the hashring. Regardless of the hash, the same cluster host can appear more than once in the hashring. The bucky client doesn't seem to like it when it derives the same host more than once from the hashring. This raises a question: can the same destination appear more than once in the hashring? It seems like that could provide a weighting factor for heterogeneous hardware. I am trying to figure out how to shoehorn carbon-c-relay and buckytools into a preexisting cluster which was scaled up with multiple instances of carbon-cache.

dolvany commented 7 years ago

I wonder if this is solvable in carbon-c-relay with a cluster of clusters approach.

cluster srv1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
  ;
cluster srv2
  jump_fnv1a_ch
    srv2:2053
    srv2:2054
  ;
cluster c1
  jump_fnv1a_ch
    srv1
    srv2
  ;
match *
  send to
    c1
  stop
  ;
buckyd srv1 srv2

deniszh commented 7 years ago

Regardless of hash, the same cluster host can appear more than once in the hashring

Does it? Not really sure.

Can the same destination appear more than once in the hashring?

IMO no - by definition of a hashring.

I am trying to figure out how to shoehorn carbon-c-relay and buckytools into a preexisting cluster which was scaled up with multiple instances of carbon-cache.

That's also part of the problem - I don't really understand what your problem is and what you're trying to achieve.

dolvany commented 7 years ago

@deniszh, allow me to illustrate with a truncated example from the buckytools readme.

buckyd graphite010-g5:a graphite010-g5:b

It shows the same host, graphite010-g5, appearing multiple times in the hashring, once for each carbon-cache instance on the host. This is precisely the carbon-cache deployment that I have. The challenge I am having is that bucky servers fails the consistency check when I configure buckyd in this manner. Perhaps I am misunderstanding something fundamental here.

grobian commented 7 years ago

Could you perhaps describe your setup from the initial relay down to the carbon-cache processes? Getting a good idea of the flow of your metrics is key to getting this right, IMO.

dolvany commented 7 years ago

Sure, @grobian. Metrics->load balancer->multiple VMs with carbon-c-relay->multiple physical boxes each running multiple instances of carbon-cache. Carbon-c-relay config is identical on all VMs--consistent hash to all carbon-cache instances. I believe this is all working as intended--each graphite key is sent to a specific carbon-cache instance. Now, I am trying to integrate the cluster management piece.

grobian commented 7 years ago

Just summing up what has been said above to ensure we're all on the same page:

  1. you use a jump hash to distribute metrics over your cluster
  2. you have two such clusters
  3. both clusters receive the same input (i.e. they are mirrors), and because they are the same size, their distribution per server is identical (jump hash doesn't care about the server name, port, or instance key, only the final ordering of the destinations)
  4. sidenote, I recently applied this fix to c-relay: https://github.com/grobian/carbon-c-relay/commit/1c50590ffd392a31f795cfcba8c473cece1e5cb6 which should bring it back to the documented ordering, both bucky and c-relay need to agree on the ordering to have good operations
  5. your clusters have multiple carbon-cache instances running on the same host, and this is directly visible in your cluster configuration (i.e. the instances affect your hash ring)
  6. you want to perform maintenance on your cluster using bucky

Due to 5, bucky and other tools get a tough job, because you probably /share/ the /var/lib/carbon/whisper directory among the multiple instances. It also makes future scaling of instances on each server up or down impossible, because it will change the hash allocation (due to jump). To solve this, people typically run a c-relay on the storage server that simply any_of's all incoming metrics to a set of carbon-cache instances on the local host, thereby hiding all of this from tools like bucky. Your best start would be to implement this so you can do 6, but it will cause metrics to move between srv1 and srv2 (and similarly for srv3 and srv4, of course).
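
To make point 3 a bit more concrete, here is a minimal, illustrative sketch of jump_fnv1a-style selection, assuming the published jump consistent hash fed by a 64-bit FNV-1a of the metric name; exactly how carbon-c-relay and buckytools build the key and order the destinations may differ (that ordering is what the fix in point 4 is about):

package main

import (
	"fmt"
	"hash/fnv"
)

// jump is the Lamping & Veach jump consistent hash: it maps a 64-bit key
// to a bucket index in [0, numBuckets).
func jump(key uint64, numBuckets int) int {
	var b, j int64 = -1, 0
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int(b)
}

func main() {
	// Ordered destination list. The hash only ever produces an index, so
	// which destination "wins" a metric depends entirely on this ordering,
	// which is why bucky and carbon-c-relay must agree on it.
	servers := []string{"srv1:2053", "srv1:2054", "srv2:2053", "srv2:2054"}

	for _, metric := range []string{"carbon.agents.a.metricsReceived", "some.app.requests.count"} {
		h := fnv.New64a()
		h.Write([]byte(metric))
		idx := jump(h.Sum64(), len(servers))
		fmt.Printf("%-35s -> index %d -> %s\n", metric, idx, servers[idx])
	}
}

Because the hash only yields an index, two clusters of the same size get the same index for every metric, which is why the mirrored clusters distribute identically.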

dolvany commented 7 years ago

1, correct. 2, correct. 3, I am not familiar enough with the inner workings of the hashes to say whether I completely understand your point regarding the final ordering--an example would certainly clarify this for me. 4, awesome. 5, correct. 6, correct. To put this in terms of configuration, it seems you are suggesting the following (leaving the mirror cluster out for brevity). Will this achieve hashring alignment across c-relay and bucky?

Front Relay

cluster c1
  jump_fnv1a_ch
    srv1:2052
    srv2:2052
  ;
match *
  send to
    c1
  stop
  ;

Back Relay srv1:2052

cluster c1
  jump_fnv1a_ch
    srv1:2053
    srv1:2054
  ;
match *
  send to
    c1
  stop
  ;

Back Relay srv2:2052

cluster c1
  jump_fnv1a_ch
    srv2:2053
    srv2:2054
  ;
match *
  send to
    c1
  stop
  ;
buckyd srv1 srv2

Also, I am curious whether the use of multiple carbon-cache instances per host is common enough to justify solving this without a second layer of relays. It seems like it would be trivial to support two layers of hashing in a single c-relay instance. Thoughts, @grobian?

grobian commented 7 years ago

You want to avoid having multiple tiers of (c-)relays, is that correct? While I understand the rationale, it currently isn't possible, and I don't see implementing double-hashing or the like as a high priority. Your config is indeed how it would look. For performance and flexibility you could use any_of on the back relays instead of jump_fnv1a_ch; it shouldn't change anything for your situation.
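
For illustration, the back relay on srv1 from the config a few comments up might then look something like this; a sketch only, assuming carbon-c-relay's any_of cluster type and the same ports as before:

cluster local
  any_of
    srv1:2053
    srv1:2054
  ;
match *
  send to
    local
  stop
  ;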

dolvany commented 7 years ago

Sounds good, @grobian. Thanks for the guidance!

azhiltsov commented 7 years ago

@dolvany we had 12 carbon-cache processes per host, with carbon-c-relay on the same host in front of them to distribute the load. At a certain point it stopped working performance-wise and we switched to go-carbon, which in our current setup can easily handle 300K points/sec, and with some tuning and external iSCSI storage up to 1000K points/sec sustained. It also eliminates carbon-c-relay on the host, and you will be able to reduce the number of destinations in your relay configs. It plays nicely with bucky. Just have a look.

grobian commented 7 years ago

I would concur with azhiltsov's suggestion. carbon-cache.py isn't multi-threaded, hence people run multiple of them in parallel. A c-relay in front of them is just a workaround; in reality carbon-cache should have been multi-threaded itself. go-carbon solves that nicely (and avoids the need for a local c-relay).

dolvany commented 7 years ago

@azhiltsov @grobian So, would go-carbon be fronted with c-relay? It looks like go-carbon is a replacement for carbon-cache. What would the design look like?

dolvany commented 7 years ago

@grobian If I use any_of on the back relay, would this result in some misalignment with CARBONLINK_HOSTS?

grobian commented 7 years ago

sender -> c-relay -> go-carbon

wrt CARBONLINK_HOSTS, I think that doesn't work at all anyway, because jump_fnv1a_ch is not understood by graphite-web. This is the reason we started carbonzipper. This "smart proxy" acts as a single carbon-cache to graphite-web; later we also replaced graphite-web itself with carbonapi.

dolvany commented 7 years ago

@grobian But I could use carbon_ch on everything and then CARBONLINK_HOSTS would align with the caveat that the distribution may not be as efficient as other hashes, yes?

grobian commented 7 years ago

No - only if you use a single cluster with carbon_ch will CARBONLINK_HOSTS be able to predict where metrics are located. It understands replication, IIRC, but it always probes in order; in other words, it's highly unsuitable for setups which need to be redundant/fault-tolerant and performant.