Closed earthgecko closed 10 years ago
The default statsd collector just passes the body directly to the statsd client: https://github.com/HubSpot/BuckyServer/blob/master/modules/statsd.coffee
Should it be splitting the input first?
Now there is a question :)
However from the statsd tcpdump maybe?? I am not sure
As I always say, "u know ur deep in it when u r using tcpdump - deepinit"
So actually, considering the looks of tcpdump, you may have to, because each entry appears that way in the tcpdump output, and if you are debugging in tcpdump, u know ur ... :)
So tcpdump seems to show a new line for each \n
[me@here ~] cat /tmp/tcpdump.8125.log | grep "varnish.rpm.total" | tail
E...7/@.8.K^PUWE_..(.(......varnish.rpm.total:9882|c
E....P@.8..:PUVD_..(.8....n.varnish.rpm.total:9865|c
E....u@.9.L.U..N_..(........varnish.rpm.total:8720|c
E....c@.;..:...._..(.......@varnish.rpm.total:29309|c
.i_..(.8....u/varnish.rpm.total:29563|c
E.....@.;.."...E_..(........varnish.rpm.total:34810|c
.varnish.rpm.total:30836|c
varnish.rpm.total:30461|c
E....q@.;.z...9G_..(.-......varnish.rpm.total:34288|c
E...,@@.;......]_..(......~.varnish.rpm.total:30207|c
[me@here ~]
Considering these packets were sent via nc direct to statsd on UDP in a multi-metric packet, they do appear as new lines in tcpdump, one per metric in the multi-metric packet.
[me@here ~]
[me@here ~] cat /tmp/tcpdump.8125.log | grep -B1 -A1 "varnish.rpm.total" | tail -n 20
--
17:00:10.673706 IP 123.124.125.9.56289 > a-statsd-server.8125: UDP, length 250
E.....@.;.."...E_..(........varnish.rpm.total:34810|c
varnish.rpm.weeds.someweed:5|c
--
E....L@.;..
.varnish.rpm.total:30836|c
varnish.rpm.weeds.someweed:2|c
--
E...b|@.;. ...3._..(.}....A
varnish.rpm.total:30461|c
varnish.rpm.weeds.someweed:1|c
--
17:00:11.185090 IP 634.213.57.711.48429 > a-statsd-server.8125: UDP, length 250
E....q@.;.z...9G_..(.-......varnish.rpm.total:34288|c
varnish.rpm.weeds.someweed:1|c
--
17:00:11.330388 IP 123.124.125.18.59872 > a-statsd-server.8125: UDP, length 250
E...,@@.;......]_..(......~.varnish.rpm.total:30207|c
varnish.rpm.weeds.someweed:2|c
[me@here ~]
However, because I am not even seeing the data hit statsd from BuckyServer, I am thinking that BuckyServer is not sending it to statsd in the first place.
I have verified a one to one test:
send multi-metric packet to Bucky: receipt confirmed in tcpdump, tcpdump confirms no data sent on to statsd
send single metric packet to Bucky: receipt confirmed in tcpdump, tcpdump confirms data sent on to statsd
The easy workaround is to just go one request per metric, but we get loads of metrics, and splitting them into well-sized multi-metric packets really reduces the number of calls at scale, if possible.
statsd has the ability to receive multi-metric packets (https://github.com/etsy/statsd/blob/master/docs/metric_types.md#multi-metric-packets)
gorets:1|c\nglork:320|ms\ngaugor:333|g\nuniques:765|s
With some other clients such as nc, when you send this directly to statsd you send that string as is, with \n, but not with BuckyServer, and this can be confusing.
With BuckyServer the UDP multi-metric packet rules do not apply; there are no such limits via TCP.
BuckyServer gives you the ability to ship valuable statsd metrics more reliably longhaul over TCP, rather than longhaul over UDP. This does have a price, you lose non-blocking fire and forget, but it has a number of advantages too.
It is very important from the client side to be able to ship as many metrics per call as possible. With statsd the packet size is limited by the "total length of the payload within your network's MTU".
There you may have to split your metrics into strings < max_payload_bytes and loop through with nc hits to statsd many times (fire and forget).
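A rough sketch of that splitting step, in case it helps. The name chunk_metrics is made up for illustration (it is not part of statsd or BuckyServer), and the byte limit is whatever fits your network's MTU. It reads one metric per line and prints groups that each fit within the limit, with a blank line between groups.

```shell
# chunk_metrics MAX_BYTES
# Hypothetical helper: reads one metric per line on stdin and prints
# groups whose newline-joined size stays within MAX_BYTES, separated by a
# blank line. Assumes ASCII metric names (awk length() counts characters).
# A single metric longer than MAX_BYTES is still emitted on its own.
chunk_metrics() {
  awk -v max="$1" '
    {
      # Flush the current group if adding this metric (plus the joining
      # newline) would push it past the limit.
      if (buf != "" && length(buf) + 1 + length($0) > max) {
        print buf
        print ""
        buf = ""
      }
      buf = (buf == "" ? $0 : buf "\n" $0)
    }
    END { if (buf != "") print buf }
  '
}
```

Each group could then go out as its own nc call to statsd, e.g. `chunk_metrics 512 < metrics.txt` where 512 and metrics.txt are placeholders.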
With the HTTP client to BuckyServer you can send all of those metrics in one POST, with no \n
This is very advantageous if you are sending lots of critical long namespace metrics to statsd, especially from distributed, longhaul geographic regions.
BuckyServer simply submits each metric to statsd; it does not submit them to statsd as multi-metric packets (so more local UDP connections, but reliable UDP).
With the HTTP client the POST data must not contain the literal \n, but rather one metric per line.
Good POST data
POST_DATA="test.bucky.alive:1|g
test.bucky.does_multi_metric_packets:1|c"
This example below would be submitted to statsd, but it would NOT make it into statsd, meaning it would not be forwarded on to graphite, et al.
BAD POST data (literal \n)
BAD_POST_DATA="test.bucky.alive:1|g\ntest.bucky.does_multi_metric_packets:1|c"
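If you already have strings in the statsd \n form lying around, something like this might convert them into the one-metric-per-line form before POSTing. to_post_body is a made-up name, just a sketch and not anything BuckyServer ships.

```shell
# to_post_body STRING
# Hypothetical helper: expands the two-character sequence \n into real
# newlines so each metric lands on its own line of the POST body.
# printf %b also expands other backslash escapes (\t, \\ ...), which is
# assumed to be harmless for plain metric names.
to_post_body() {
  printf '%b\n' "$1"
}
```

For example, `POST_DATA=$(to_post_body "$BAD_POST_DATA")` should give the same value as the good POST data above.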
We can confirm with tcpdump that BuckyServer receives the POST with the multi-metric packet and that statsd receives all the metrics.
On the BuckyServer start tcpdump
STATSD_PORT=8125
BUCKYSERVERPORT="your.buckyserver.port"
tcpdump -i any port $STATSD_PORT -A > /tmp/tcpdump.$STATSD_PORT.log &
tcpdump -i any port $BUCKYSERVERPORT -A > /tmp/tcpdump.$BUCKYSERVERPORT.log &
With an HTTP client like wget (or curl), to make it "non blocking", do not forget timeouts (no curl timeout was added in the example)
BUCKYSERVER="your.buckyserver.ipaddress"
BUCKYSERVERPORT="your.buckyserver.port"
POST_DATA="test.bucky.alive:1|g
test.bucky.does_multi_metric_packets:1|c"
wget --tries=1 --timeout=1 --dns-timeout=1 --post-data="$POST_DATA" --header="Content-Type: text/plain" http://$BUCKYSERVER:$BUCKYSERVERPORT/bucky/v1/send
# or with curl
# curl -X POST -H "Content-Type: text/plain" -d "$POST_DATA" http://$BUCKYSERVER:$BUCKYSERVERPORT/bucky/v1/send
On the BuckyServer kill tcpdump
kill `pidof tcpdump`
We can see that BuckyServer got the POST and statsd got all the metrics from the multi-metric POST (albeit via more UDP connections), so you can now reliably forward statsd metrics longhaul via TCP.
cat /tmp/tcpdump.$BUCKYSERVERPORT.log | grep POST
cat /tmp/tcpdump.$STATSD_PORT.log | grep "bucky"
[me@here ~] kill `pidof tcpdump`
11 packets captured
13 packets received by filter
0 packets dropped by kernel
[me@here ~] 144 packets captured
146 packets received by filter
0 packets dropped by kernel
[1]- Done tcpdump -i any port 8125 -A > /tmp/tcpdump.8125.log
[2]+ Done tcpdump -i any port 8080 -A > /tmp/tcpdump.8080.log
[me@here ~] cat /tmp/tcpdump.8080.log | grep POST
..e.x..POST /bucky/v1/send HTTP/1.0
Access-Control-Allow-Methods: POST
[me@here ~]
[me@here ~] cat /tmp/tcpdump.8125.log | grep "bucky"
E.....@.@.<h.............o..test.bucky.alive:1|g
test.bucky.does_multi_metric_packets:1|c
[me@here ~]
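To script that last check instead of eyeballing the grep output, a small sketch. metric_seen is a made-up name, and it only tells you the metric name shows up somewhere in the capture log, nothing more.

```shell
# metric_seen LOGFILE METRIC
# Hypothetical helper: succeeds (exit 0) if "METRIC:" appears anywhere in a
# tcpdump -A capture log. -F treats the metric name as a fixed string so the
# dots are not regex wildcards; -q keeps grep quiet.
metric_seen() {
  grep -F -q -- "$2:" "$1"
}
```

Usage would be something like `metric_seen /tmp/tcpdump.$STATSD_PORT.log test.bucky.alive && echo forwarded`.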
The correct metrics and values are relayed to graphite and our valuable metrics have TCP transport
go BuckyServer you are my new favorite tool - reliable tcp transport for statsd metrics - thanks guys!
etsy's tcp server type is not quite there yet, but BuckyServer sure is, and the best thing is we can now test whether our valuable metrics were shipped longhaul rather than fired and forgotten :) Because we can test the exit code. AND another really sweet thing is, if for some reason BuckyServer failed, we can just failover to nc'ing direct udp to statsd. Makes me smile. Nice way to end the week.
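A minimal sketch of that exit code failover, assuming wget exits non-zero when the POST fails. The function names and STATSD_HOST are made up for illustration; BUCKYSERVER, BUCKYSERVERPORT and STATSD_PORT are the variables from the example above. The UDP fallback still has to fit within your MTU, so a big batch may need splitting first.

```shell
# All names below (send_via_bucky, send_via_udp, send_with_failover,
# STATSD_HOST) are hypothetical; BUCKYSERVER, BUCKYSERVERPORT and STATSD_PORT
# are the variables used earlier in this thread.

# POST the metrics (one per line) to BuckyServer over TCP.
send_via_bucky() {
  wget -q -O /dev/null --tries=1 --timeout=1 --dns-timeout=1 \
    --post-data="$1" --header="Content-Type: text/plain" \
    "http://$BUCKYSERVER:$BUCKYSERVERPORT/bucky/v1/send"
}

# Fire and forget the same metrics straight at statsd over UDP.
send_via_udp() {
  printf '%s\n' "$1" | nc -u -w1 "$STATSD_HOST" "$STATSD_PORT"
}

# Try BuckyServer first; fall back to UDP only if wget exits non-zero.
send_with_failover() {
  send_via_bucky "$1" || send_via_udp "$1"
}
```

Then `send_with_failover "$POST_DATA"` would do the TCP attempt and only drop back to nc if that fails.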
We currently push a lot of things to statsd with multi-metric packets.
I have confirmed with tcpdump that multi-metric submissions are not making it through to statsd but single metrics are. Is there any formatting or way to get multi-metric submissions through the BuckyServer app and forwarded on to statsd?