oetiker / rrdtool-2.x

RRDtool 2.x - The Time Series Database

Support for writing rrd files or hierarchy into distributed filesystem #20

Open mnikhil-git opened 10 years ago

mnikhil-git commented 10 years ago

Hi,

I would like rrdtool to support writing RRDs (or a whole RRD hierarchy) into a (fast) distributed filesystem, so that a lack of disk storage on a single node, or the spindle/throughput limits of a single disk, would not be a concern.

Nikhil

luqasz commented 10 years ago

Ideally, I think a better approach is to create some partitioning/sharding of the RRD database across multiple hosts. This would allow scaling horizontally without any pain. I am thinking of something similar to the Cassandra database.

oetiker commented 10 years ago

since rrdtool normally uses pre-allocated storage, lack of space should not be a problem ... and obviously you can have as many rrd server hosts as you want ... splitting a single rrd file across multiple hosts seems rather excessive ... please elaborate

luqasz commented 10 years ago

What I mean by saying "partitioning/sharding" is not splitting an rrd file across multiple hosts. I have in mind hashing each rrd file's path (with MD5, for example) and placing the file on a host chosen from that hash. This would greatly reduce I/O because of the partitioning: the more hosts, the less I/O pressure per host and the more data can be stored.
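A minimal sketch of what such hash-based placement could look like, assuming a static host list (the host names and the helper function are hypothetical, and simple modulo placement is used rather than consistent hashing):

```python
import hashlib

# Hypothetical host list; in practice this would come from configuration.
HOSTS = ["rrd-host-1", "rrd-host-2", "rrd-host-3", "rrd-host-4"]

def host_for_rrd(rrd_path: str, hosts=HOSTS) -> str:
    """Pick the host that stores a given rrd file, based on an MD5 hash
    of its path (plain modulo placement, not consistent hashing)."""
    digest = hashlib.md5(rrd_path.encode("utf-8")).hexdigest()
    return hosts[int(digest, 16) % len(hosts)]

print(host_for_rrd("routers/core1/eth0.rrd"))
```

Note that plain modulo placement forces most files to move when a host is added; a consistent-hashing ring avoids that, which would matter if the cluster is meant to grow.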

This "big table" like approach is used in opentsdb That is why it is so efficient and scalable.

One key point is how applications will communicate with the rrd server(s). Placing a master host in front creates a SPOF (single point of failure). I would personally "hide" that and delegate the data-distribution job to the rrd servers themselves. For failover, a replication factor could be used: a factor of 2 would mean that the same data is placed on 2 hosts.
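To make the replication-factor idea concrete, here is a small sketch under the same assumptions as above (hypothetical host list, MD5 placement): the file's hash picks a starting position, and the next `replication_factor` hosts in the list each hold a copy.

```python
import hashlib

HOSTS = ["rrd-host-1", "rrd-host-2", "rrd-host-3", "rrd-host-4"]  # hypothetical

def replica_hosts(rrd_path: str, replication_factor: int = 2, hosts=HOSTS):
    """Return the hosts that should all hold a copy of this rrd file.
    Walks the host list from the hashed position, so a factor of 2
    places the same file on 2 distinct hosts."""
    start = int(hashlib.md5(rrd_path.encode("utf-8")).hexdigest(), 16) % len(hosts)
    return [hosts[(start + i) % len(hosts)] for i in range(replication_factor)]

# Example: with factor 2, every update would be sent to both returned hosts.
print(replica_hosts("routers/core1/eth0.rrd"))
```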

This would allow easy horizontal scaling, but with horizontal scaling a data-consistency issue arises. Consider the following example: 4 hosts, replication factor = 2; hosts 1 and 3 hold the same rrd files, and hosts 2 and 4 hold the same data (for failover purposes). Incoming data is kept in RAM first. If host 1 is unavailable, the data is only written to the rrd file on host 3. When host 1 comes back, the missed data can be read from its backup host (host 3) and written to its rrd file; this may collide with the heartbeat parameter.

The above example is just a tiny fraction of the "things that may happen", so obviously more research is needed, for example on how to write the data. The best would be to use sequential reads/writes rather than random ones.
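A rough sketch of that write path, under the stated assumptions (the `send`/`is_up` hooks stand in for whatever transport the rrd servers would actually use; nothing here is existing rrdtool API):

```python
import time
from collections import defaultdict

class ReplicatedWriter:
    """Sketch of the write path described above: every update goes to all
    replicas of an rrd file; updates that cannot be delivered are buffered
    in RAM and replayed once the host comes back. The send/is_up hooks are
    hypothetical placeholders for real transport code."""

    def __init__(self, replicas, send, is_up):
        self.replicas = replicas          # e.g. ["rrd-host-1", "rrd-host-3"]
        self.send = send                  # send(host, rrd_path, timestamp, value)
        self.is_up = is_up                # is_up(host) -> bool
        self.backlog = defaultdict(list)  # host -> [(rrd_path, ts, value), ...]

    def update(self, rrd_path, value, timestamp=None):
        ts = timestamp or int(time.time())
        for host in self.replicas:
            if self.is_up(host):
                self.send(host, rrd_path, ts, value)
            else:
                # Keep the update in RAM so it can be replayed later.
                self.backlog[host].append((rrd_path, ts, value))

    def replay(self, host):
        """Called when a host comes back: flush its backlog in timestamp
        order so updates stay monotonic."""
        for rrd_path, ts, value in sorted(self.backlog.pop(host, []), key=lambda u: u[1]):
            self.send(host, rrd_path, ts, value)
```

Replaying the buffered timestamps only works while they still fall within the rrd file's step/heartbeat window, which is exactly the collision the example above points out.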

Now this may sound/look like a lot of work, but think of the benefits.