influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

bz1: system should skip empty meta files on startup #4629

Closed: otoolep closed this issue 8 years ago

otoolep commented 9 years ago

Start a single node, and run this program:

#!/usr/bin/python

import requests
import random

seconds_in_30_years = 86400 * 365 * 30

for i in range(10000):
    # Pick a random timestamp anywhere in a 30-year window so the points
    # spread across many different shards.
    startTime = random.randint(1, seconds_in_30_years)
    # The URL uses precision=s, so the timestamp is sent in seconds.
    payload = "cpu value=%d %d\n" % (startTime, startTime)
    r = requests.post("http://localhost:8086/write?db=db&precision=s", data=payload)
    if i % 100 == 0:
        print('%d written' % i)

Let it write a few hundred points, then stop all write load. Then, on restart, the node generates a large number of empty meta files, e.g.:

~/.influxdb/wal/db/default/73 $ ls -al
total 8
drwx------   2 philip philip 4096 Oct 30 15:19 .
drwx------ 131 philip philip 4096 Oct 30 12:45 ..
-rw-rw-r--   1 philip philip    0 Oct 30 15:19 000000.meta

otoolep commented 9 years ago

This is with the bz1 engine.

otoolep commented 9 years ago

cc @pauldix @beckettsean @rossmcdonald

otoolep commented 9 years ago

Actually, I may have misinterpreted the output; I'm looking into it. We're definitely ending up with a bunch of 0-length files though.

otoolep commented 9 years ago

It is possible that the log messages are somewhat deceiving, and the system is simply opening all the WAL files. Why so many are 0-length is still to be determined.

otoolep commented 9 years ago

There is no real issue here, only one of interpretation. However, the system is doing a lot of unnecessary work.

My analysis tells me the following:

- Large numbers of writes spread across time result in large numbers of shards.
- If this number is big enough, the log messages are emitted at such a high rate, and with somewhat deceiving text (containing the word "writing"), that it gives the impression data is being written to the files.
- On startup, the system works through each meta file (only 1 per directory), flushes it, deletes it, and then recreates the meta file. Since there are no other meta files in the directory, the ID resets back to 0 and an empty file is recreated.
- This process is repeated on every start.

Of course, the system is not responsive until this cycle is complete, so it takes needlessly long to start up.

So the behaviour of the system under a large number of sparse writes is not great, but no data is lost. The code should probably be modified so that if there is only a single meta file and it is empty, nothing is done.
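For illustration only, here is a minimal standalone sketch of that check. This is not the actual bz1 WAL code; the function name and the shard path used in main are made up.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// shouldSkipMetaFlush reports whether the startup flush for a shard's WAL
// directory can be skipped entirely: the directory holds exactly one .meta
// file and that file is zero-length, so flushing, deleting, and recreating
// it would be wasted work.
func shouldSkipMetaFlush(shardWALDir string) (bool, error) {
	metas, err := filepath.Glob(filepath.Join(shardWALDir, "*.meta"))
	if err != nil {
		return false, err
	}
	if len(metas) != 1 {
		return false, nil
	}
	fi, err := os.Stat(metas[0])
	if err != nil {
		return false, err
	}
	return fi.Size() == 0, nil
}

func main() {
	// Example: check one of the shard directories from the listing above.
	// The path is illustrative; substitute a real shard WAL directory.
	dir := filepath.Join(os.Getenv("HOME"), ".influxdb/wal/db/default/73")
	skip, err := shouldSkipMetaFlush(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("skip flush for %s: %v\n", dir, skip)
}

With a check along these lines, startup would just open the existing empty meta file instead of flushing, deleting, and recreating it for every shard directory.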

pauldix commented 9 years ago

Thanks for looking into this @otoolep. I'm going to leave this open for now. We won't fix this on the bz1 storage engine, but it would be good to verify that this isn't an issue on the tsm1 storage engine once it gets set as the default.

otoolep commented 9 years ago

Yeah, I'll ensure the behaviour of tsm1 is sane in this case, before we release 0.9.5.

pauldix commented 8 years ago

Fixed on tsm with 0.9.6