ytyou / ticktock

TickTockDB is an OpenTSDB-like time series database, with much better performance.
GNU General Public License v3.0

Crash if you send a "second" timestamp when resolution is set to millisecond #73

Open kwando opened 3 weeks ago

kwando commented 3 weeks ago

If you send a timestamp in seconds to a server that is running with millisecond resolution, it crashes and hangs forever.

TickTockDB v0.20.0,  Maintained by
 Yongtao You (yongtao.you@gmail.com) and Yi Lin (ylin30@gmail.com).
 This program comes with ABSOLUTELY NO WARRANTY. It is free software,
 and you are welcome to redistribute it under certain conditions.
 For details, see <https://www.gnu.org/licenses/>.
Writing to log file: /home/kwando/log/ticktock.log
TickTockDB is ready...
ticktockdb-0.20.0-beta/bin/tt(+0xd158)[0xaaaab536d158]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0xffffb389e860]
ticktockdb-0.20.0-beta/bin/tt(+0x58a80)[0xaaaab53b8a80]
ticktockdb-0.20.0-beta/bin/tt(+0x3ea9c)[0xaaaab539ea9c]
ticktockdb-0.20.0-beta/bin/tt(+0x27a8c)[0xaaaab5387a8c]
ticktockdb-0.20.0-beta/bin/tt(+0x32b24)[0xaaaab5392b24]
ticktockdb-0.20.0-beta/bin/tt(+0x390d4)[0xaaaab53990d4]
ticktockdb-0.20.0-beta/bin/tt(+0x20914)[0xaaaab5380914]
ticktockdb-0.20.0-beta/bin/tt(+0x21290)[0xaaaab5381290]
ticktockdb-0.20.0-beta/bin/tt(+0x2e6f8)[0xaaaab538e6f8]
/lib/aarch64-linux-gnu/libstdc++.so.6(+0xdfadc)[0xffffb34dfadc]
/lib/aarch64-linux-gnu/libc.so.6(+0x8597c)[0xffffb32c597c]
/lib/aarch64-linux-gnu/libc.so.6(+0xeba4c)[0xffffb332ba4c]
Interrupted (11), shutting down...
Start shutdown process...

ylin30 commented 3 weeks ago

This is a serious bug. Let me see if I can repro it. Thx!

ylin30 commented 3 weeks ago

Hm... I can't repro the crash. Below is a transcript of my test. I started v0.20.0 (main branch) with --tsdb.timestamp.resolution millisecond, successfully wrote a data point with a timestamp in seconds (10 digits), and queried it afterward.

[yi-IdeaPad ticktock (main)]$ git log
commit 85f6fa4b380792faba30d41aab4fdb52a3126c9f (HEAD -> main, tag: v0.20.0, origin/main, origin/HEAD)
Author: Yongtao You <yongtao.you@gmail.com>
Date:   Thu Apr 18 19:35:01 2024 -0700

    fix http-server issue

commit 45e1ef9c562935b4f1604e8b375496e8058595eb

....
[yi-IdeaPad ticktock (main)]$ make all
make: Nothing to be done for 'all'.
[yi-IdeaPad ticktock (main)]$ ./bin/tt -c conf/tt.conf --tsdb.timestamp.resolution millisecond --http.server.port 6182,6183 &
[1] 112683
 TickTockDB v0.20.0,  Maintained by
 Yongtao You (yongtao.you@gmail.com) and Yi Lin (ylin30@gmail.com).
 This program comes with ABSOLUTELY NO WARRANTY. It is free software,
 and you are welcome to redistribute it under certain conditions.
 For details, see <https://www.gnu.org/licenses/>.
Writing to log file: /home/ylin30/ticktock/log/ticktock.log
[yi-IdeaPad ticktock (main)]$ TickTockDB is ready...

[yi-IdeaPad ticktock (main)]$
[yi-IdeaPad ticktock (main)]$ ./admin/config.sh
{
  "http.server.port": "6182,6183",
  "tsdb.timestamp.resolution": "millisecond",
}
[yi-IdeaPad ticktock (main)]$ curl -s -XPOST 'http://localhost:6182/api/put' -d 'put testM1 1633412175 123 host=foo'
[yi-IdeaPad ticktock (main)]$
[yi-IdeaPad ticktock (main)]$ curl -s 'http://localhost:6182/api/query?start=1600000000&m=avg:testM1'
[{"metric":"testM1","tags":{"host":"foo"},"aggregateTags":[],"dps":{"1633720984655":123.0}}][yi-IdeaPad ticktock (main)]$
[yi-IdeaPad ticktock (main)]$ ./admin/ping.sh
pong
[yi-IdeaPad ticktock (main)]$

The crash might be triggered by something else. Could you please show me your repro steps?

kwando commented 3 weeks ago

Yeah, you are right. The millisecond setting doesn't seem to be the culprit. I can get it to crash with or without that flag. This is what I do (running the aarch64 binary from the downloads page):

rm -rf data log
ticktockdb-0.20.0-beta/bin/tt --tsdb.timestamp.resolution millisecond

Then, in another console, run:

ticktockdb-0.20.0-beta/admin/put.sh

a few times (usually 2 or 3), and it crashes.
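
For readers without the admin scripts at hand, the repro steps above can be approximated with a short Python sketch. It is an assumption, not the actual contents of put.sh: the metric name, value, and tag are copied from the "Inserted: put test.metric ..." output quoted later in this thread, the /api/put endpoint is the one used in the curl command earlier, and port 6182 is assumed to be the HTTP port of the running server. The names `build_put_line`, `send_put`, and `repro` are hypothetical.

```python
import time
import urllib.request


def build_put_line(metric, ts, value, tags):
    """Build one 'put' line in the format shown in this thread."""
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {ts} {value} {tag_str}".rstrip()


def send_put(line, url):
    """POST a single put line to the server and return the HTTP status."""
    req = urllib.request.Request(url, data=line.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


def repro(count=10, url="http://localhost:6182/api/put"):
    """Send `count` points with second-resolution timestamps (assumed URL)."""
    for _ in range(count):
        ts = int(time.time())  # 10-digit timestamp in seconds
        line = build_put_line("test.metric", ts, 123, {"host": "host1"})
        print("Sending:", line, "->", send_put(line, url))
        time.sleep(1)  # keep timestamps distinct between writes
```

Calling `repro()` against a server started with --tsdb.timestamp.resolution millisecond should exercise the same write path; whether it triggers the crash may depend on the build and platform.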

ylin30 commented 3 weeks ago

I confirmed that I can repro it. We will look into the issue ASAP.

[yi-IdeaPad ticktock (main)]$ ./bin/tt -c conf/tt.conf --tsdb.timestamp.resolution millisecond --http.server.port 6182,6183 &
[1] 114013
 TickTockDB v0.20.0,  Maintained by
 Yongtao You (yongtao.you@gmail.com) and Yi Lin (ylin30@gmail.com).
 This program comes with ABSOLUTELY NO WARRANTY. It is free software,
 and you are welcome to redistribute it under certain conditions.
 For details, see <https://www.gnu.org/licenses/>.
Writing to log file: /home/ylin30/ticktock/log/ticktock.log
[yi-IdeaPad ticktock (main)]$ TickTockDB is ready...

[yi-IdeaPad ticktock (main)]$
[yi-IdeaPad ticktock (main)]$ ./admin/ping.sh
pong
[yi-IdeaPad ticktock (main)]$
[yi-IdeaPad ticktock (main)]$ for i in {1..10}; do ./admin/put.sh ; done
Inserted: put test.metric 1724266750 123 host=host1
Inserted: put test.metric 1724266751 123 host=host1
./bin/tt(+0x70400)[0x560d923bc400]
/lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f9291b30090]
./bin/tt(+0x1645a)[0x560d9236245a]
./bin/tt(+0x42b5d)[0x560d9238eb5d]
./bin/tt(+0x6a4ea)[0x560d923b64ea]
./bin/tt(+0x5ea09)[0x560d923aaa09]
./bin/tt(+0x3d1fc)[0x560d923891fc]
./bin/tt(+0x618d0)[0x560d923ad8d0]
./bin/tt(+0x620cf)[0x560d923ae0cf]
./bin/tt(+0x5047c)[0x560d9239c47c]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6df4)[0x7f92919c5df4]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f929176a609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f9291c0c353]
Interrupted (11), shutting down...
Start shutdown process...

ylin30 commented 3 weeks ago

Actually, you are right. This is caused by sending a timestamp in seconds to a server configured for millisecond resolution. I ran in debug mode and it failed immediately on an assertion. We did add an optimization that tries to recognize second/ms timestamps automatically, but it seems there is a regression somewhere. Thanks for reporting this.

For now, please use second timestamps (10 digits) in writes if the TT server uses the default tsdb.timestamp.resolution (i.e., second), or millisecond timestamps (13 digits) if tsdb.timestamp.resolution=millisecond.
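
Until a fix lands, the digit-count rule above can be enforced on the client side before each write. The following is a minimal Python sketch, not part of TickTockDB: `normalize_timestamp` is a hypothetical helper, and it assumes that every timestamp has either 10 digits (seconds) or 13 digits (milliseconds), rejecting anything else rather than guessing.

```python
def normalize_timestamp(ts, resolution):
    """Convert ts to the server's resolution ('second' or 'millisecond').

    The unit of ts is inferred from its digit count only:
    10 digits means seconds, 13 digits means milliseconds.
    """
    if resolution not in ("second", "millisecond"):
        raise ValueError(f"unknown resolution: {resolution!r}")
    if ts < 0:
        raise ValueError("timestamp must be non-negative")

    digits = len(str(ts))
    if digits == 10:
        is_millis = False
    elif digits == 13:
        is_millis = True
    else:
        raise ValueError(f"cannot infer unit of {ts} ({digits} digits)")

    if resolution == "millisecond":
        return ts if is_millis else ts * 1000
    # Target is seconds: drop the millisecond part if present.
    return ts // 1000 if is_millis else ts
```

For example, `normalize_timestamp(1633412175, "millisecond")` yields `1633412175000`, which matches what a server running with tsdb.timestamp.resolution=millisecond expects. Converting milliseconds down to seconds truncates the sub-second part, so it is lossy.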

kwando commented 3 weeks ago

Thanks for looking into this :)