influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Missing tsm file in shard makes writes fail with (unclear) error message "[shard x] unexpected end of JSON input" #23628

Open mtroback opened 2 years ago

mtroback commented 2 years ago

Steps to reproduce:

  1. The hard drive fills up, which prevents InfluxDB from writing data (log: error="write /var/lib/influxdb/engine/data/e7adcc11582b82f9/autogen/222/index/3/L1-00000001.tsi: no space left on device" path=/var/lib/influxdb/engine/data/e7adcc11582b82f9/autogen/222/index/3/L0-00000001.tsl)
  2. The affected shard ends up with corrupt data
  3. Free up disk space, then restart InfluxDB so that it has room to write again
  4. The shard is now missing its .tsm/.tsl file
  5. Try to write data to the time period covered by that shard
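To find which shards are in the state described in step 4, a scan along the following lines may help. It is only a sketch: it assumes the `<data dir>/<bucket id>/<retention policy>/<shard id>/` layout seen in the log path above, and `find_suspect_shards` is a hypothetical helper, not part of InfluxDB. It also flags zero-byte files, on the assumption that an empty file could be what triggers "unexpected end of JSON input" (that is the message Go's `encoding/json` gives for empty or truncated input); the issue does not confirm which file is actually being read. A shard with no .tsm file is a hint rather than proof of damage.

```python
from pathlib import Path


def find_suspect_shards(data_dir):
    """Return {shard_dir: [reasons]} for shard directories that look damaged.

    Assumes the layout <data_dir>/<bucket id>/<retention policy>/<shard id>/,
    as seen in the log path in step 1. This is a heuristic, not a repair tool.
    """
    suspects = {}
    for shard_dir in sorted(Path(data_dir).glob("*/*/*")):
        # Shard directories have numeric names (e.g. "222"); skip anything else.
        if not shard_dir.is_dir() or not shard_dir.name.isdigit():
            continue

        reasons = []

        # The issue describes a shard left without any .tsm file.
        if not any(shard_dir.glob("*.tsm")):
            reasons.append("no .tsm file")

        # Assumption: a write cut short by a full disk may leave empty files.
        for f in sorted(shard_dir.rglob("*")):
            if f.is_file() and f.stat().st_size == 0:
                reasons.append(f"zero-byte file: {f.relative_to(shard_dir)}")

        if reasons:
            suspects[str(shard_dir)] = reasons
    return suspects
```

Calling `find_suspect_shards("/var/lib/influxdb/engine/data")` with read access to the data directory returns a dictionary that maps each flagged shard directory to the reasons it was flagged. The function only reads the directory tree and never modifies it.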

Expected behavior: The data is written successfully (the database should handle the missing .tsm file and do whatever is necessary to accept new data).

Actual behavior: The write fails with an error message.

HTTP response body (Python client): {"code":"internal error","message":"unexpected error writing points to database: [shard 222] unexpected end of JSON input"}

Log: Aug 16 13:08:05 COMPUTER influxd-systemd-start.sh[224631]: ts=2022-08-16T11:08:05.063037Z lvl=warn msg="Write failed creating shard" log_id=0cLaEUw0000 service=storage-engine service=write shard=222 error="opening shard previously failed with: [shard 222] unexpected end of JSON input"
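Because the message itself says little, a client can at least extract the shard number from it so the failure can be traced to a directory on disk. The sketch below is one way to do that: it parses the HTTP response body quoted above and pulls out the shard ID. `failed_shard_id` and the regular expression are illustrative names, not part of any InfluxDB client library, and the pattern assumes the message keeps the exact wording shown in this report.

```python
import json
import re

# Matches the wording seen in this report; other versions may phrase it differently.
SHARD_ERROR = re.compile(r"\[shard (\d+)\] unexpected end of JSON input")


def failed_shard_id(response_body):
    """Return the shard ID named in the error body, or None if it does not match."""
    try:
        message = json.loads(response_body).get("message", "")
    except (ValueError, TypeError, AttributeError):
        # Body is not JSON, or is not a JSON object.
        return None
    if not isinstance(message, str):
        return None
    match = SHARD_ERROR.search(message)
    return int(match.group(1)) if match else None
```

With the response body from this report, the function returns the integer 222, which corresponds to the directory name under `autogen`.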

Environment info:

Config: Default

Other info: Solved by deleting the affected shard folder (222 in my case) under /engine/data/xxxx/autogen.

panickos commented 2 years ago

We had the same issue and couldn't find any documentation on how to resolve it.

Deleting the files made that message go away, but we then ran into a "Locked file" error as well. We also tried a backup and restore, but that failed with the same "unexpected end of JSON input" error. We ended up having to reset the database and start from scratch.