ytyou / ticktock

TickTockDB is an OpenTSDB-like time series database, with much better performance.
GNU General Public License v3.0

older data not accessible, and data corruption after a few days #35

Closed Soren-klars closed 1 year ago

Soren-klars commented 1 year ago

Hi there,

I have been using version 0.10.2 since 21 Jan, continuously writing a small amount of data for 2 electricity meters. I also use Grafana to display the data. This worked fine for 6 days. Today TT had trouble returning data: only some of my 4 charts got data, seemingly at random. I decided to restart TT (cleanly), but the shutdown hung for a few minutes. The first start afterwards did not work, so I shut it down again, and the next start succeeded. The problem is that all the old data is now missing.

Yesterday I also tried to access older data from the first day. Those requests ran forever and never got a response. Maybe these attempts also caused the issue I saw today.

I'm a bit down about losing my data again (last time was with the incompatible version upgrade). Anyway, I attached the data folder (maybe you could extract the data and send it to me?), the log file, and a screenshot of how it looked yesterday. And I must say that I appreciate all your work; I'm aware that it is still very much beta.

Have a lovely day

Attachments: Dashboard-showing-good-data, data-backup_28-Jan.zip, ticktock.log.zip

ylin30 commented 1 year ago

@Soren-klars I took a quick look at your data. There is a 1674259200.1674345600.temp folder, which holds temporary compaction data for 1/20-1/21. The compaction didn't finish for some reason, and I suspect that may be the problem. Since the original data are still there, I think we should be able to recover them. I will try on my side and let you know.
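For readers following along, here is a minimal sketch of how such a folder name can be read. It assumes the two numbers are Unix timestamps in seconds marking the start and end of the bucket; that reading is inferred from the name itself (the values are exactly 86400 seconds apart) and is not taken from TickTockDB documentation. The helper name decode_bucket_name is hypothetical.

```python
from datetime import datetime, timezone


def decode_bucket_name(name):
    """Split '<start>.<end>[.temp]' into two UTC datetimes.

    Assumption: both fields are Unix timestamps in seconds.
    """
    parts = name.split(".")
    start_s, end_s = int(parts[0]), int(parts[1])
    return (
        datetime.fromtimestamp(start_s, tz=timezone.utc),
        datetime.fromtimestamp(end_s, tz=timezone.utc),
    )


start, end = decode_bucket_name("1674259200.1674345600.temp")
print(start.isoformat(), end.isoformat())
```

In UTC this folder covers 21 Jan 2023 00:00 to 22 Jan 2023 00:00; in a time zone behind UTC the same bucket starts on 20 Jan local time.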

I think you are using v2 compressor. But to gain more insights, would you please run /admin/config.sh and send the results over?

ytyou commented 1 year ago

Make sure all the previous ticktock processes are gone before restarting new ones. There might be ticktock processes from previous runs still hanging around.

Soren-klars commented 1 year ago

Hi @ylin30, I'm running TT as a service, and I make sure the process has definitely exited before I start it again. By the way, calling restart on this service doesn't work for some reason (it never finishes), so I always do it in 2 steps. Here is the output of ./config.sh:

{
  "append.log.dir": "/home/sevi/programs/ticktock/append",
  "log.file": "/home/sevi/programs/logs/ticktock.log",
  "tsdb.compressor.version": "2",
  "tsdb.data.dir": "/home/sevi/programs/ticktock/data",
  "tsdb.flush.frequency": "5min",
  "tsdb.gc.frequency": "5min",
  "tsdb.page.count": "65536",
  "tsdb.rotation.frequency": "1d",
  "tsdb.timestamp.resolution": "second"
}

ylin30 commented 1 year ago

@Soren-klars Good news: @ytyou tried, and the query returned data successfully, so your data is safe. To help debug, could you please send the query manually and check the result (see the attached picture for a sample query)? The query shown may not be exactly what Grafana issued. You can retrieve the actual query in your browser. In Chrome, do this:

  1. right-click on the panel,
  2. choose "Inspect",
  3. then open the "Network" tab,
  4. then refresh the page to issue the queries; you will see a list of requests,
  5. go to "Payload" to see the query payload.

image

image
ylin30 commented 1 year ago

@Soren-klars We also tried another form of query, different from Grafana's, which uses JSON format. It also works.

image

We suspect compaction might have a problem. Compaction exists to save disk space. For now, please disable it by adding this line to your config:

tsdb.compact.frequency = 0sec

You can also delete the 1674259200.1674345600.temp folder in your data dir. It is safe to delete. BTW, we didn't delete it when trying to repro your problem.

ylin30 commented 1 year ago

@Soren-klars We found a bug in compaction. We are planning a hotfix in an upcoming release, 0.10.4-beta. It will take a few days to test fully. For now, please disable compaction by:

  1. adding tsdb.compact.frequency = 0sec to your config;
  2. removing all the *.temp folders in your data dir;
  3. restarting TT.

Your data should still be available as normal. Please try it out and let us know.
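The folder cleanup in step 2 can be sketched as below. This is an illustration rather than an official tool: the helper name remove_temp_dirs is hypothetical, it only looks at folders directly under the data dir whose names end in .temp, and it is assumed to be run while TT is stopped. It defaults to a dry run so the matches can be checked before anything is deleted.

```python
import shutil
from pathlib import Path


def remove_temp_dirs(data_dir, dry_run=True):
    """Return the *.temp folders directly under data_dir.

    With dry_run=False the folders are also deleted. Regular files and
    folders without the .temp suffix are never touched.
    """
    temp_dirs = sorted(p for p in Path(data_dir).glob("*.temp") if p.is_dir())
    if not dry_run:
        for path in temp_dirs:
            shutil.rmtree(path)
    return temp_dirs


# Example (path taken from the config output in this thread):
#   print(remove_temp_dirs("/home/sevi/programs/ticktock/data"))
#   remove_temp_dirs("/home/sevi/programs/ticktock/data", dry_run=False)
```

Running it once with the default dry_run=True and reading the returned list before deleting is the safer order.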

We are grateful for your help with the bug report.

Soren-klars commented 1 year ago

Hi guys, thanks for looking into this so quickly. I will disable the compaction for now. And just for completeness (I guess you don't need it anymore), here are the Grafana requests:

POST http://192.168.2.52:3000/api/datasources/proxy/1/api/query

1.) {"start":1674950400000,"queries":[{"metric":"energy.kitchen","aggregator":"sum","downsample":"1m-avg","tags":{"direction":"consumed","type":"current-power"}}],"msResolution":false,"globalAnnotations":true}
2.) {"start":1674950400000,"queries":[{"metric":"energy.kitchen","aggregator":"sum","downsample":"1m-avg","tags":{"direction":"produced","type":"current-power"}}],"msResolution":false,"globalAnnotations":true}
3.) {"start":1674950400000,"queries":[{"metric":"energy.kitchen","aggregator":"sum","downsample":"1m-avg","tags":{"direction":"consumed","type":"kwh-last-10-min"}}],"msResolution":false,"globalAnnotations":true}
4.) {"start":1674950400000,"queries":[{"metric":"energy.kitchen","aggregator":"sum","downsample":"1m-avg","tags":{"direction":"produced","type":"kwh-last-10-min"}}],"msResolution":false,"globalAnnotations":true}
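For anyone who wants to replay such a request outside Grafana, a minimal sketch follows. It rebuilds the first payload above and can POST it with an explicit timeout, so a hanging query fails instead of running forever. The helper names build_query and post_query are hypothetical, and sending the POST directly to http://localhost:6182/api/query is an assumption based on the /api/query path Grafana proxies to and the port used elsewhere in this thread.

```python
import json
import urllib.request


def build_query(metric, tags, start_ms, aggregator="sum", downsample="1m-avg"):
    """Build a payload shaped like the Grafana requests quoted in this thread."""
    return {
        "start": start_ms,
        "queries": [
            {
                "metric": metric,
                "aggregator": aggregator,
                "downsample": downsample,
                "tags": tags,
            }
        ],
        "msResolution": False,
        "globalAnnotations": True,
    }


def post_query(payload, url="http://localhost:6182/api/query", timeout=30):
    """POST the payload as JSON; the URL and port are assumptions, adjust as needed."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_query(
    "energy.kitchen",
    {"direction": "consumed", "type": "current-power"},
    1674950400000,
)
print(json.dumps(payload, separators=(",", ":")))
```

Calling post_query(payload) against a running instance returns the decoded response, or raises once the timeout expires.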

For now I stopped the service (which again took a long time to complete), emptied the data folder to start fresh, disabled compaction, started it again, and re-imported the old data via a Python script. I'll let you know if there are any more issues...

Soren-klars commented 1 year ago

Could you please explain how to transfer the data to another computer in order to access it? I tried copying over just the data folder, and even the complete 'ticktock' folder, but I always get only: []

I tried versions 0.10.2 and 0.10.3. How did you manage to access the data I sent you? I copied the data from my OrangePi PC running Debian onto my laptop running Debian-based Linux Mint. What else is necessary to open the same data folder in my local TT instance?

I'm trying even the simplest query, which works when I point it at my OrangePi:

curl 'http://localhost:6182/api/query?start=15d-ago&m=avg:1m-avg:energy.kitchen'

ylin30 commented 1 year ago

@Soren-klars You simply need to copy the original backup data into tsdb.data.dir and restart TT. Note:

  1. Clean up tsdb.data.dir first if it contains any other data.
  2. You must use compressor v2, since your backup data is in v2.
  3. Restart TT after the original backup data has been copied into tsdb.data.dir.

Please see the snapshot below.

image
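The copy steps above can be sketched as follows. This is an illustration under stated assumptions, not an official procedure: the helper name restore_backup is hypothetical, TT is assumed to be stopped while it runs, and the compressor version still has to be set to match the backup in the config. Both paths are resolved to absolute paths so it is clear which folder is actually replaced.

```python
import shutil
from pathlib import Path


def restore_backup(backup_dir, data_dir):
    """Replace data_dir with a copy of backup_dir and return the absolute target path.

    Everything already in data_dir is deleted first, so run this only
    while the database is stopped.
    """
    backup = Path(backup_dir).resolve()
    target = Path(data_dir).resolve()
    if not backup.is_dir():
        raise FileNotFoundError(f"backup folder not found: {backup}")
    if backup == target:
        raise ValueError("backup_dir and data_dir must be different folders")
    if target.exists():
        shutil.rmtree(target)        # clean up any previous data
    shutil.copytree(backup, target)  # copy the backup into place
    return target


# Example with placeholder paths:
#   print(restore_backup("data-backup/data", "./data"))
```

Pointing tsdb.data.dir at the absolute path the function returns, then restarting TT, completes the sequence.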

And here are the results of your query:

ticktock@329a2a4e1016:~/ticktock$ curl 'http://localhost:6182/api/query?start=15d-ago&m=avg:1m-avg:energy.kitchen'
[{"metric":"energy.kitchen","tags":{},"aggregateTags":["type","direction"],"dps":{"1674259260":125.519999999999996,"1674345600":1107.2654371733449352,"1674345660":132.4738622647991519,"1674432000":1017.3658448275862156,"1674432060":239.6486590909090921,"1674518400":777.8930431034482353,"1674518460":181.5699797895902634,"1674604800":845.8637758620689056,"1674604860":171.880630153276968,"1674691200":534.3077370689654799,"1674691260":191.8212500000000205,"1674777600":904.007183908046045,"1674777660":72.6581464646464639}}]ticktock@329a2a4e1016:~/ticktock$
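To make a response like the one above easier to read, here is a small sketch that converts the "dps" keys to UTC datetimes. It assumes the keys are Unix timestamps in seconds, which matches the second resolution in the config shown earlier in this thread; the helper name dps_to_points is hypothetical, and the sample string keeps only the first two data points of the output above.

```python
import json
from datetime import datetime, timezone


def dps_to_points(response_text):
    """Return a list of (metric, points) pairs from an OpenTSDB-style response.

    points is a time-sorted list of (UTC datetime, float value) tuples.
    Assumption: the "dps" keys are Unix timestamps in seconds.
    """
    result = []
    for entry in json.loads(response_text):
        points = sorted(
            (datetime.fromtimestamp(int(ts), tz=timezone.utc), float(value))
            for ts, value in entry["dps"].items()
        )
        result.append((entry["metric"], points))
    return result


sample = (
    '[{"metric":"energy.kitchen","tags":{},"aggregateTags":["type","direction"],'
    '"dps":{"1674259260":125.519999999999996,"1674345600":1107.2654371733449352}}]'
)
for metric, points in dps_to_points(sample):
    for when, value in points:
        print(metric, when.isoformat(), value)
```

The first point of the sample falls at 2023-01-21T00:01:00 UTC.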
Soren-klars commented 1 year ago

OK, my bad. I assumed that if I didn't specify the data folder location, it would default to the data folder in TT, but it didn't. Specifying either an absolute path or just './data' works. Sorry for that...

ylin30 commented 1 year ago

NP. We will add these steps to the docs. Thanks.

ylin30 commented 1 year ago

The V3 compressor has been fixed in version 0.10.4. InfluxDB line protocol has been supported since 0.11.0 (but please use 0.11.1 instead, since there is a bug in 0.11.0).

In 0.11.1, the default data and log dirs are the current dir, so you shouldn't run into problems such as permissions or confusion about locations.

I have closed this issue. Version 0.11.1 is the latest version to use.

Soren-klars commented 1 year ago

Thank you very much, great work. Are there any differences from a user's point of view? There doesn't seem to be any update to the docs.

Soren

On Sun, 5 Mar 2023, 18:32 Yi Lin, @.***> wrote:

Closed #35 https://github.com/ytyou/ticktock/issues/35 as completed.

ylin30 commented 1 year ago

For your original problem with compaction and the v3 compressor, you don't need to do anything as long as you use the latest version, 0.11.1.

For InfluxDB line protocol support, please refer to this wiki. Unfortunately, 0.11.1-beta can't work with Telegraf (InfluxDB's default client) over line protocol yet. You can only send writes in line protocol with curl or your own code. We are working on that and hope to release it in 1-2 weeks.