Closed otoolep closed 8 years ago
This is with the bz1 engine.
cc @pauldix @beckettsean @rossmcdonald
Actually, I might have misinterpreted the output; looking into it. Definitely ending up with a bunch of 0-length files, though.
It is possible that the log messages are somewhat deceiving, and that the system is just opening all the WAL files. Why so many are 0-length is still to be determined.
There is no real issue here, only one of interpretation. However, the system is doing a lot of unnecessary work.
My analysis tells me the following:
-- Large numbers of writes across time result in large numbers of shards.
-- If this number is big enough, the log messages are emitted at a very high rate, with somewhat deceiving text (containing the word "writing"). This gives the impression that data is being written to the files.
-- On startup, the system works through each meta file (only 1 per directory), flushes it, deletes it, and then recreates it. However, since there are no more meta files in the directory, the ID resets to 0, and an empty file is recreated.
-- On every start, this process is repeated.
Of course, the system is not responsive until this cycle is complete, so it takes needlessly long to start up.
So the behaviour of the system under a large number of sparse writes is not great, but no data is lost. The code should probably be modified so that if there is only a single meta file, and it is empty, it does nothing.
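A rough sketch of the proposed guard, written in Python purely for illustration (InfluxDB itself is Go, and this is not its code). The `meta.` file prefix and the function names `list_meta_files`, `needs_meta_flush`, and `startup_pass` are assumptions made up for this example; only the rule "a single, empty meta file means do nothing" comes from the analysis above.

```python
import os

# Hypothetical naming; not InfluxDB's actual on-disk layout.
META_PREFIX = "meta."


def list_meta_files(shard_dir):
    """Return the sorted paths of the meta files in one shard directory."""
    return sorted(
        os.path.join(shard_dir, name)
        for name in os.listdir(shard_dir)
        if name.startswith(META_PREFIX)
    )


def needs_meta_flush(shard_dir):
    """Return False only when the directory holds a single, empty meta file."""
    files = list_meta_files(shard_dir)
    if len(files) == 1 and os.path.getsize(files[0]) == 0:
        return False  # nothing to flush, so skip the delete/recreate cycle
    return True


def startup_pass(shard_dirs, flush):
    """Call flush(shard_dir) only where there is real work; return the count."""
    flushed = 0
    for shard_dir in shard_dirs:
        if needs_meta_flush(shard_dir):
            flush(shard_dir)
            flushed += 1
    return flushed
```

With a check like this, a restart over many idle shards would only stat one file per directory instead of flushing, deleting, and recreating it each time.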
Thanks for looking into this @otoolep. I'm going to leave this open for now. We won't fix this on the bz1 storage engine, but it would be good to verify that this isn't an issue on the tsm1 storage engine once it gets set as the default.
Yeah, I'll ensure the behaviour of tsm1 is sane in this case, before we release 0.9.5.
Fixed on tsm with 0.9.6
Start a single node, and run this program:
Let it write a few hundred points, then stop all write load. Then on restart of the node, it generates a large series of empty meta files. E.g.