influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Compaction crash loops and data loss on Raspberry Pi 3 B+ under minimal load #11339

Closed ryan-williams closed 2 years ago

ryan-williams commented 5 years ago

Following up on this post with a fresh issue to highlight worse symptoms that don't seem explainable by a db-size cutoff (as was speculated on #6975 and elsewhere):

In the month since that post, I've had to forcibly mv the data/collectd directory twice to unstick influx from 1-2min crash loops that lasted days, seemingly due to compaction errors.

Today I'm noticing that my temps database (which I've not messed with during these collectd db problems, and gets about 5 points per second written to it) is missing large swaths of data from the 2 months I've been writing to it:

The last gap, between 1/14 and 1/17, didn't exist this morning (when influx was still crash-looping, before the most recent time I ran mv /var/lib/influxdb/data/collectd ~/collectd.bak). That data was just recently discarded, it seems, possibly around the time I performed my "work-around" for the crash loop:

sudo service influxdb stop
sudo mv /var/lib/influxdb/data/collectd ~/collectd.bak
sudo service influxdb start
influx -execute 'create database collectd'

The default retention policy should not be discarding data, afaict:

> show retention policies on temps
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true

Here's the last ~7d of syslogs from the RPi server, 99.9% of which is logs from Influx crash-looping.

There seem to be messages about failed compactions and "cannot allocate memory" errors.

Is running InfluxDB on an RPi supposed to generally work, or am I in uncharted territory just by attempting it?

wollew commented 5 years ago

I don't know if it is supposed to work but I can definitely reproduce this issue on my Raspi 3B+.

ryan-williams commented 5 years ago

I've given up trying to keep it running.

My planned next steps, whenever I have time, are:

wollew commented 5 years ago

If you're willing to build InfluxDB yourself, you could try the branch in pr #12362

aemondis commented 5 years ago

My RPI 3B+ with InfluxDB has just started being hit by this same issue, ironically also while collecting environmental data. I haven't started losing data yet, but I'm getting the endless crash loops filling up syslog with the same memory-allocation and compaction errors.

#12362 mentions this issue too, in relation to mmap on 32-bit platforms and their limited allocatable memory. I too am debating my options... it runs on an RPI precisely because I don't want a hot, power-hungry server for what should be a simple function!

ryan-williams commented 5 years ago

Thanks for the corroboration!

FWIW, I no longer think this is due to the 32-bit platform / max memory problem.

I don't remember the details, but I think I saw a DB get past that size on my RPi, and I also saw this crash start well before anything should have been hitting that limit.

I've seen evidence of disk write failures or slowness possibly causing the initial problem (files quarantined with .bad extensions in Influx's data directories).

It seems like we're stuck until one of us captures the full state of one of these failing deployments, and someone who knows how to parse that state has a look…

aemondis commented 5 years ago

I do find it seems to go through fits and starts... it runs fine for several days, then suddenly services start dropping offline. I'm actually thinking of modifying the service file to limit scheduler priority (via chrt) so that Influx cannot consume all the resources of the system; I get the feeling the load average spiking as high as it does cannot be helping the situation, since the CPU on the Pi becomes near-unusable under high load. I have also been finding that under high I/O caused by Influx, I start getting corruption in some places on the SD card...
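For anyone wanting to try the same, something like the following systemd drop-in should do it. This is only a sketch: it assumes the packaged influxdb.service, and the Nice/CPU/IO scheduling values are illustrative rather than anything I've validated on a Pi.

# Sketch: deprioritise influxd's CPU and disk scheduling via a systemd drop-in
# (assumes the stock influxdb.service on Raspbian; values are illustrative)
sudo mkdir -p /etc/systemd/system/influxdb.service.d
sudo tee /etc/systemd/system/influxdb.service.d/priority.conf > /dev/null <<'EOF'
[Service]
Nice=19
CPUSchedulingPolicy=idle
IOSchedulingClass=idle
EOF
sudo systemctl daemon-reload
sudo systemctl restart influxdb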

My influx db sits on an external USB3-based SSD (an old Intel X25-M 74GB SSD, so by no means fast, but definitely a million times faster than the SD card!) - so I don't think the disk I/O is an issue in my case.

Perhaps if you are finding corruption and influx is sitting on the SD, it could be the same issue as me with the high CPU... but perhaps also give a different SD a try? SD cards aren't really designed for heavy write activity, as they don't have any smarts to clean up deleted files with trim etc...

I have also seen many cases of failed SD cards on thin clients due to antivirus definitions, so I know SD card failures are definitely not unheard of...

ryan-williams commented 5 years ago

Good to know you are seeing this, or something similar, on an SSD.

I switched Influx to a USB thumb drive in my Pi when I suspected an SD-card I/O issue. It should be an order of magnitude faster than the SD card, but I saw the issue similarly on both.

akors commented 5 years ago

Hi, just to share my experiences for anybody affected by the issue. I had a database that was about 1.5 GB in size, and influxdb would keep crashing on me.

My workaround was to temporarily increase the swap space so that InfluxDB could finish loading and compacting. The swap space can be disabled or reduced after that.

After reaching about 900 MB and starting to swap, the memory usage actually dropped back down, and InfluxDB is now using only 200 MB. All my data looks like it has been retained, at least from before InfluxDB started crapping out two weeks ago.
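For reference, the rough procedure on Raspbian looks like this. It's only a sketch, assuming the stock dphys-swapfile setup; the 2048 MB size is just an example, so size the swap to whatever lets the startup and compaction finish.

# Sketch: temporarily enlarge swap on Raspbian so InfluxDB can finish compacting
sudo dphys-swapfile swapoff
sudo sed -i 's/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=2048/' /etc/dphys-swapfile   # MB; raise CONF_MAXSWAP too if needed
sudo dphys-swapfile setup
sudo dphys-swapfile swapon
sudo service influxdb restart
# once memory settles down again, shrink CONF_SWAPSIZE and repeat the setup/swapon steps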

pinkynrg commented 5 years ago

@alexpaxton I see you are one of the programmers who has contributed the most to the InfluxDB project.

Sorry to drag you into the conversation here, but I would love to see some more attention given to this post. Would you be so kind as to somehow address the issue?

This is a similar issue: https://github.com/influxdata/influxdb/issues/6975 And this is a pull request that might fix it: https://github.com/influxdata/influxdb/pull/12362

It would be good to know whether we can stick with Influx on our RPIs or not.

Moving to another engine would be painful for a lot of us, and I feel a good number of people use Influx on RPIs for IoT projects.

Please let us know and thank you very much in advance.

fluffynukeit commented 5 years ago

@ryan-williams, how large is each of your uncompressed shard groups under the default retention policy of 168 hours? Using mmapped TSM files, compaction jobs can grab a large chunk of the process address space because they are writing out new TSM files (presumably mmapped) while reading other TSM files (also mmapped). Just ballparking, but you end up needing 2x the mmapped address space: one for the TSM inputs and one for the TSM outputs. So if your shard group duration is large, resulting in a large file size, you can hit your mmap limit during a compaction job when otherwise you'd have enough headroom in the process address space.

If this is the issue you are encountering, I think #12362 will definitely help you. You might also, or instead, need to change the default retention policy so that the shard group duration is much smaller, so your compaction jobs handle a much smaller amount of data at a time.
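For anyone who wants to check this on their own box, here is a rough sketch. The database name and the 1d shard duration are just examples, and note that altering the shard duration only affects shard groups created after the change; existing shards keep their old duration.

# Sketch: measure on-disk shard sizes, then shrink the shard group duration
sudo du -sh /var/lib/influxdb/data/collectd/autogen/*          # one numbered directory per shard
influx -execute 'ALTER RETENTION POLICY "autogen" ON "collectd" SHARD DURATION 1d'
influx -execute 'SHOW RETENTION POLICIES ON "collectd"'        # confirm shardGroupDuration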

pinkynrg commented 5 years ago

@fluffynukeit, what do you think the uncompressed shard group size limit is with a 168-hour RP? I thought the problem was the total size of the database.

fluffynukeit commented 5 years ago

@pinkynrg I think it will depend on how many TSM files you have and how much data you collect in that 168 hours. By default, TSMs all get mmapped to the process address space. So you could have a situation where a compaction job for 168 hours RP works fine for a nearly empty database, but eventually the size of all the TSM files could be large enough that the compaction job fails because there is not enough address space to do it.

Avoiding so much mmapping was one of the motivating reasons for #12362. My use case is that I wanted to keep my data forever on a device with a projected 10 year lifespan. Even if I made my shard group duration a tiny 1 hour (making compaction jobs very small), I would still eventually hit the address space limit as my database filled up with more and more TSM files.

pinkynrg commented 5 years ago

I would like to collect ~5k tags every minute for 90 weeks. That would also need to get downsampled every 1h and 1d.

I would then route my queries to the best bucket (minute, hour, day), depending on the time delta of the query.

I was waiting to size my shards in the best possible way. Right now they are all 7d long.

fluffynukeit commented 5 years ago

What matters is the MB size of the TSM files for each shard group. I'd guess that 2x this size is the upper bound of address space needed for a compaction job. TSM size is not easy to predict because the data get compressed, so you just have to test it out and measure it. In my case, if you look at the logs on #12362, the uncompressed shard group is about 400-450 MB. So let's assume a compaction job requires 900 MB of address space. With an empty DB, there are no mmapped TSM files, so your process address space is close to empty, and there is much more than 900 MB free. The compaction job runs.

Over time, the older shards will stop getting compacted, but they will still take up address space. Let's say you have 15 shard groups each with 200 MB in them, plus an uncompacted hot shard of 450 MB. That's 3.45GB of address space taken up by database data. If your user-space address limit is 3.6 GB, the next compaction job will likely fail because there's not enough free address space to run it. It would need an additional 450 MB to mmap the compaction job output file.

Don't take my size figures as gospel. I'm just making up numbers to be illustrative. You'll have to test it out for your own data and tune it appropriately. Or use #12362.

pinkynrg commented 5 years ago

OK, I will test #12362.

Can you confirm that it has been working fine for you so far? No errors at all?

fluffynukeit commented 5 years ago

I have not encountered any problems, but I also have not tested it exhaustively. Our device is still in development.

pinkynrg commented 5 years ago

In the meantime I think I will also try an unofficial 64-bit image for the RPI.

https://wiki.debian.org/RaspberryPi3

@fluffynukeit, that should technically resolve the issue too, correct?

UPDATE:

We weren't able to go with a 64-bit OS because, as predicted, it ends up using almost double the memory for other processes (such as the Gunicorn web server, for example), so even if it solved the InfluxDB problem it wouldn't be a good final solution anyway.

stuartcarnie commented 5 years ago

tl;dr

Server memory resources were low enough that newly compacted TSM files were unable to be mmapped during the final phase of a compaction.

The log files were analyzed and it was determined that a low memory condition (trace_id=0Cy_LzMW000) resulted in a failed compaction:

TSM compaction (start)
Beginning compaction
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000021-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000022-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000023-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000024-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000025-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000026-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000027-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000028-000000001.tsm
Error replacing new TSM files       cannot allocate memory
TSM compaction (end)

This in turn caused temporary TSM files to be orphaned. Subsequent compactions for this group failed due to the orphaned .tsm.tmp files:

TSM compaction (start)
Beginning compaction
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000021-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000022-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000023-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000024-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000025-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000026-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000027-000000001.tsm
Compacting file /var/lib/influxdb/data/collectd/autogen/49/000000028-000000001.tsm
Aborted compaction      compaction in progress: open /var/lib/influxdb/data/collectd/autogen/49/000000028-000000002.tsm.tmp: file exists
TSM compaction (end)

This issue is filed as #14058.
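Until #14058 is fixed, the orphaned temporary files can be cleaned up by hand so the compaction can be retried. This is only a sketch (default Debian paths assumed), and it only clears the "file exists" abort; it does not address the underlying memory pressure.

# Sketch: remove orphaned compaction temp files while InfluxDB is stopped
sudo service influxdb stop
sudo find /var/lib/influxdb/data -name '*.tsm.tmp' -print      # review the list first
sudo find /var/lib/influxdb/data -name '*.tsm.tmp' -delete
sudo service influxdb start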

Low memory issues

Fixing #14058 will not address the problems that occur when additional TSM data cannot be mmapped during a low-memory state.

Due to the low memory condition, snapshots eventually began to fail with the same mmap error, cannot allocate memory. This causes a build-up of .wal files and allows the in-memory cache to keep growing. Eventually, the server panicked due to no available memory after the cache grew too large. On restart, the server continues to crash, as it is not able to allocate sufficient memory to load the existing .tsm files and rebuild the cache from the large number of existing .wal files.

Notes collected during analysis

FileStore.replace enumerates the new files, removing the .tmp extension:

https://github.com/influxdata/influxdb/blob/e9bada090f10ea04f07ab56a839690ab15328269/tsdb/engine/tsm1/file_store.go#L738

Creates a new TSMReader:

https://github.com/influxdata/influxdb/blob/e9bada090f10ea04f07ab56a839690ab15328269/tsdb/engine/tsm1/file_store.go#L763

which attempts to mmap the file:

https://github.com/influxdata/influxdb/blob/05e7def600e85394f4826081f0f33f90b6ea0f3f/tsdb/engine/tsm1/reader.go#L1334

mmap fails with ENOMEM and returns cannot allocate memory. FileStore.replace handles the error:

https://github.com/influxdata/influxdb/blob/e9bada090f10ea04f07ab56a839690ab15328269/tsdb/engine/tsm1/file_store.go#L765-L770

and renames the file back to .tsm.tmp. The error is returned to the caller, ultimately resulting in the Error replacing new TSM files message:

https://github.com/influxdata/influxdb/blob/aa3dfc066260674699ef8b01a4499bdcb425537a/tsdb/engine/tsm1/engine.go#L2210

The .tsm.tmp files are not cleaned up; that cleanup only happens in Compactor.writeNewFiles:

https://github.com/influxdata/influxdb/blob/2dd913d71bd015c9039a767176152f8ca959ab38/tsdb/engine/tsm1/compact.go#L1052-L1060

jjakob commented 4 years ago

I had the same issue with InfluxDB on a Raspberry Pi: it was crashing at startup, even before starting the compaction. Setting swap via dphys-swapfile to 2GB had no effect. I had reservations about converting from TSM to TSI, as there are other open issues reporting that TSI uses more memory.

The fix was to copy /var/lib/influxdb to a 64-bit Debian Buster based system and run InfluxDB there. This loaded the files and started the compaction immediately, which took about 5 minutes to complete, as there were a ton of uncompacted files. Memory usage spiked to about 3.8G resident during the initial startup. Subsequent startups after compaction used about 217M resident.

Copying the database back to the Pi resulted in a successful startup of InfluxDB with it using only 163M resident.

So 64-bit systems will use considerably more RAM during normal operation (217M vs 163M resident), and a 64-bit build of Raspbian may not be the best choice. It definitely wouldn't have helped in my case, as the initial startup took 3.8G and the Pi only has 1G RAM; even a 2G swap file may not have been enough.

A long-term solution would be to start the compaction much earlier, so we don't end up with so many uncompacted files. Perhaps this can be tuned via the compaction settings.

vogler commented 4 years ago

Sorry, this is getting long. Can someone give a summary of what the problem/status is? My RPi3's influxdb just started cycling "Aborted compaction".

$ sudo du -h -d1 /var/lib/influxdb
1.4G    /var/lib/influxdb/data
8.0K    /var/lib/influxdb/meta
1.8M    /var/lib/influxdb/wal
1.4G    /var/lib/influxdb
$ sudo find /var/lib/influxdb/ -iname '*.tmp'
/var/lib/influxdb/data/telegraf/autogen/60/000000016-000000002.tsm.tmp
/var/lib/influxdb/data/telegraf/autogen/62/000000014-000000002.tsm.tmp

Moving those .tmp files away and restarting doesn't help. I would like to keep the data - what should I do? Looks like ~600MB free during influxdb start, how come it "cannot allocate memory"? Can't it just compact less (at once?)?

jjakob commented 4 years ago

@vogler I'd first make sure you have as much free memory as possible - stop all other services and reboot if possible (this will clear possible memory fragmentation). My Influx used up all available memory on my Pi (~860MiB) all on its own and still ran out, and this was in the startup phase, even before compaction. The only fix was to copy it to a more powerful 64-bit machine (it can be non-ARM; just install the same Influx version from the repos as was on the failing machine), have it compact, then copy it back again. There was no data lost and no other errors. I don't know whether this will keep reoccurring. Possibly try setting max-concurrent-compactions = 1.

My errors were "Failed to open shard" log_id=0EqCYeWl000 service=store trace_id=0EqCYf1G000 op_name=tsdb_open db_shard_id=88 error="[shard 88] error opening memory map for file /var/lib/influxdb/data/telegraf/autogen/88/000000007-000000002.tsm: cannot allocate memory"

aemondis commented 4 years ago

@vogler I second @jjakob on the move to another server. It's the only way to recover the data. Until the InfluxDB team addresses the way compaction operates on address-space-limited devices (a 32-bit OS and the restrictive RAM of the RPi; even with the 4GB Pi 4, I have the same issue!), you have no alternative short of using another time-series DB platform. I tried max-concurrent-compactions = 1, but in my case at least it still fails. I just gave up on the compaction process entirely on the Pi and now rely on occasionally shipping everything to a VM on my main PC.

I have since recovered my Influx DB multiple times using this method of transferring to a PC VM. I simply stop the services, tar the files, scp them over, and start the services; within about 2 minutes the files are compacted... then I stop the services, ship them back, and I'm done. The lack of a fix for this issue suggests it may be easier to simply script this shipping between hosts.
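If anyone wants to script it, the procedure looks roughly like this. The hostname and file names are placeholders, the bigger machine needs the same InfluxDB version installed, and influxd should be stopped on both ends whenever the files are copied.

# Sketch: compact the database on a bigger 64-bit machine, then ship it back
sudo service influxdb stop
sudo tar -C /var/lib -czf /tmp/influxdb.tar.gz influxdb
scp /tmp/influxdb.tar.gz user@bigbox:/tmp/
# on the big machine: stop influxdb, extract the tarball over /var/lib/influxdb,
#   chown -R influxdb:influxdb /var/lib/influxdb, start influxdb, wait for
#   "TSM compaction (end)" in the logs, then stop influxdb and tar it back up
scp user@bigbox:/tmp/influxdb-compacted.tar.gz /tmp/
sudo tar -C /var/lib -xzf /tmp/influxdb-compacted.tar.gz
sudo chown -R influxdb:influxdb /var/lib/influxdb
sudo service influxdb start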

akors commented 4 years ago

Question for the people who moved away from InfluxDB as a result of this issue: which database do you use instead?

By the way, my current "workaround" is to keep my database size very very small. The main offender for DB size was collectd. I cleaned out the store, created retention policies and continuous queries for data downsampling and now the collectd DB currently sits at around 60 megabytes.

This will probably work just fine for me, but is obviously not a solution if you need high-volume, high-resolution data.

pinkynrg commented 4 years ago

@ITGuyDave, do you really have the same issue with RPI4? Is it with a 32 or 64 bit OS?

jjakob commented 4 years ago

My above-mentioned "fix" only lasted 3.5 days until influxdb started crashing again. Then it was offline for 4 days before somehow coming back on its own; I have no idea how. I didn't check on it until now, so I have no logs older than when the DB came back again; it may have OOMed the Pi so hard that it rebooted, or something. The logs show it crashed at least 5 times before successfully starting and compacting, and it's been running fine for the 2 days since.

Sep 11 21:20:41 hapi influxd[14877]: ts=2019-09-11T19:20:41.963474Z lvl=info msg="Reading file" log_id=0HpS1GbW000 engine=tsm1 service=cacheloader path=/var/lib/influxdb/wal/_internal/monitor/527/_00831.
Sep 11 21:20:41 hapi influxd[14877]: runtime: out of memory: cannot allocate 1622016-byte block (565379072 in use)
Sep 11 21:20:41 hapi influxd[14877]: fatal error: out of memory

@akors

By the way, my current "workaround" is to keep my database size very very small. The main offender for DB size was collectd. I cleaned out the store, created retention policies and continuous queries for data downsampling and now the collectd DB currently sits at around 60 megabytes.

This will probably work just fine for me, but is obviously not a solution if you need high-volume, high-resolution data.

That's interesting. I think my culprit is telegraf's system metrics, which log a similar amount of data as collectd (every 10s: cpu, load avg, memory, processes, ctx switches, forks, swap usage, disk i/o). My main use case for influx is to log metrics from ebusd via telegraf, which is ~2 measurements/sec max (20/10s), a lot less than telegraf's system metrics.

Can you give details on the continuous queries you created for data downsampling? I wouldn't want to downsample my ebusd data, but it's fine for the system metrics, which aren't so important. Maybe there's a way to shorten the compaction intervals so that each compaction has less uncompacted data to load, but I don't know how and don't have time to research it; it would be highly appreciated if someone did and shared their findings.
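I haven't validated these on a Pi, but the knobs that control snapshot and compaction behaviour live under the existing [data] section of /etc/influxdb/influxdb.conf in InfluxDB 1.x. The values below are illustrative guesses aimed at making snapshots smaller and compactions less concurrent, not tested recommendations.

# Sketch: compaction/snapshot tuning knobs to set under [data] in /etc/influxdb/influxdb.conf
#
#   max-concurrent-compactions = 1        # run only one compaction at a time
#   cache-snapshot-memory-size = "16m"    # snapshot the in-memory cache to TSM sooner
#   compact-throughput = "8m"             # throttle compaction disk writes
#   compact-throughput-burst = "16m"
#
sudo nano /etc/influxdb/influxdb.conf     # edit the [data] section by hand, then:
sudo service influxdb restart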

aemondis commented 4 years ago

@ITGuyDave, do you really have the same issue with RPI4? Is it with a 32 or 64 bit OS?

Yes - I am still using Raspbian on it (a 32-bit OS), so the same upper memory limit issue occurs after some time. I've since retasked the RPI4 for other duties, so I haven't played around with it much more, but the RPI3 is still running the InfluxDB.

However, I am absolutely certain that if I were running a 64-bit OS, this compaction issue would not occur. It would still suffer severe performance degradation during compaction if memory usage exceeded the 4GB of physical RAM and started paging, but it would still succeed (eventually). I have yet to hear of a stable and supported 64-bit RPi OS in any case. There are several out there, but many lose key functionality of the Raspberry Pi, such as GPIO support, and require a lot of customisation to get going properly.

That's interesting. I think my culprit is telegraf's system metrics, which log a similar amount of data as collectd (every 10s: cpu, load avg, memory, processes, ctx switches, forks, swap usage, disk i/o). My main use case for influx is to log metrics from ebusd via telegraf, which is ~2 measurements/sec max (20/10s), a lot less than telegraf's system metrics.

Interestingly... my use case for InfluxDB is logging both the telegraf system metrics and messages received via Mosquitto MQTT. All in all, I'm peaking at something like 23 metrics/sec when all my MCUs are in full swing - although it does jump around a bit, since some of the sensors can only poll every ~3 seconds, whereas others poll more than 4 times per second. The nature of my logging is that there are tens of different metrics, but I am also tagging them by device and sensor. Maybe that partitioning has something to do with it? I'm not sure how InfluxDB treats data tagged like this behind the scenes, if it does anything different at all...

Currently my data directory is 1.9GB. The wal directory is at 405MB across 11000 files (and growing rapidly). I'm already suffering the dreaded compaction issues the same day after running the last compaction, so it's just a matter of time before it dies again...

akors commented 4 years ago

Can you give details on the continuous queries you created for data downsampling?

# Create retention policies: retain data for a week, a month and 6 months
CREATE RETENTION POLICY "one_week" ON "collectd" DURATION 1w REPLICATION 1 DEFAULT;
CREATE RETENTION POLICY "one_month" ON "collectd" DURATION 30d REPLICATION 1;
CREATE RETENTION POLICY "six_months" ON "collectd" DURATION 182d REPLICATION 1;

# Create continuous queries: downsample to 1 minute for one month, downsample to 10 minutes for 6 months.
CREATE CONTINUOUS QUERY cq_1m_for_one_month ON collectd BEGIN SELECT mean(*) INTO collectd.one_month.:MEASUREMENT FROM collectd.one_week./.*/ GROUP BY time(1m), * END
CREATE CONTINUOUS QUERY cq_10m_for_six_months ON "collectd" BEGIN SELECT mean(*) INTO "collectd"."six_months".:MEASUREMENT FROM "collectd"."one_month"./.*/ GROUP BY time(10m),* END

Note that this will create "mean_value" and "mean_mean_value" fields in the one_month and six_months retention policies respectively, due to issue #7332.

aemondis commented 4 years ago

For anyone still battling with this issue... Raspbian now has an experimental 64-bit kernel available. I have seen successful compaction on my RPI 4 (4 GB RAM) since switching to that kernel. Technically the 64-bit kernel works on the 3 series too, but I would probably suggest upgrading to a RPI 4 for the extra memory, as it's more likely to sustain larger databases in the long run.

Info on the 64-bit kernel is here: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=250730

jjakob commented 4 years ago

@ITGuyDave I suspect that the higher amount of RAM (4 GB vs 1 GB) is the key factor, not the 64-bit kernel, as I've detailed the memory usage in my previous post. During the compaction influxd used ~3.8G RAM on an amd64 OS, so while this isn't directly comparable to ARM64, it's indicative. If someone wants to do testing with a 64-bit kernel and OS build on RPi3 we'll know for sure, but I doubt it'll improve anything; I suspect it'll make it worse.

aemondis commented 4 years ago

I haven't migrated the InfluxDB over to the RPI4 yet, as I don't yet have a suitable power setup for it to run stably - so I'm still battling away with the RPI3 and the InfluxDB compaction errors that typically occur every 12 days.

During the compaction influxd used ~3.8G RAM on an amd64 OS

When I ran the compaction for my database it peaked at 1.8GB on a 64-bit OS. On the 32-bit Raspbian OS on my RPI4 w/ 4GB RAM it failed 100% of the time and restarted the service in an endless loop. On a 64-bit kernel (on the RPI4) it peaked at 2.1GB this time but succeeded, and has been stable since. Interestingly though, the Raspbian 64-bit kernel is JUST the kernel - it still runs a 32-bit userland - but I have no issues whatsoever with compaction on the 64-bit kernel versus a 100% failure rate on the 32-bit one.

If someone wants to do testing with a 64-bit kernel and OS build on RPi3 we'll know for sure, but I doubt it'll improve anything; I suspect it'll make it worse.

If I get the chance I will play about with 64-bit on the RPI3, but I suspect it will make no difference due to the lack of RAM in the first place - it will just page to the SD card. According to the Raspbian devs there are some things that will generally run faster under 64-bit, but mostly the big improvements are to be had on the RPI4, due to things like USB3 and better graphics being available. There's some functionality that breaks with 64-bit Raspbian too, around unique RPi features (e.g. certain drivers and libraries for things like Kodi), but otherwise it's been rock solid for me.

@ITGuyDave, do you really have the same issue with RPI4? Is it with a 32 or 64 bit OS?

As above - I've had 100% failure rate with InfluxDB compaction on 32-bit Raspbian, with 100% success with the 64-bit Raspbian kernel enabled. I wouldn't advise running the 64-bit kernel on anything less than an RPI4 with 4 GB RAM though, as it would make very little difference and probably just page to SD card rather than providing a meaningful improvement.

Long story short... InfluxDB needs to sort out the way it uses memory. The DB draws a hell of a lot of memory during general use with no load, and cannot handle compaction without chewing through ridiculous amounts of memory. I'm playing with TimescaleDB as an alternative (Postgres-based) and the memory usage is negligible while the query performance is comparable. Queries are a bit clunkier, but if the InfluxDB issues continue I'll probably move over to TimescaleDB. InfluxDB is just too high-maintenance, and the devs seem to have no interest in improving the fundamental issues with memory management in this DB. I cannot believe there is no way to even set an upper limit!

vogler commented 4 years ago

Just copied the data, let compaction run on macOS, and copied it back. I noted the steps here, maybe it's helpful for someone: https://github.com/vogler/smart-home/blob/master/influxdb-fail.md

Strangely, after this procedure it's missing the last 3 days of data, even though Chronograf displayed data for that period before InfluxDB started crashing on the RPi today. The log on macOS doesn't mention any problems.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

somera commented 4 years ago

On my side the problem still exists.

-rw-r--r-- 1 influxdb influxdb 110054811 Jan 21 20:26 /var/lib/influxdb/data/telegraf/autogen/165/000000097-000000002.tsm.tmp
-rw-r--r-- 1 influxdb influxdb 94412901 Jan 22 00:37 /var/lib/influxdb/data/telegraf/autogen/102/000000142-000000002.tsm.tmp

I get this problem every 2-3 days.

Connected to http://localhost:8086 version 1.7.9
InfluxDB shell version: 1.7.9
> SHOW RETENTION POLICIES ON telegraf
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 1344h0m0s 168h0m0s           1        true
> SHOW RETENTION POLICIES ON collectd
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 1344h0m0s 168h0m0s           1        true

unmeninfot commented 4 years ago

For anyone still battling with this issue... Raspbian now has an experimental 64-bit kernel available. I have seen successful compaction on my RPI 4 (4 GB RAM) since switching to that kernel. Technically the 64-bit kernel works on the 3 series too, but I would probably suggest upgrading to a RPI 4 for the extra memory, as it's more likely to sustain larger databases in the long run.

Info on the 64-bit kernel is here: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=250730

Thanks a lot. It worked for me. Influx, Grafana, and Node-RED had been working on an RPI4 (4GB) for 6 months without problems. Then InfluxDB entered an endless crash loop, with "out of memory" errors in the logs. In the task manager you could see that the virtual memory Influx was using grew really fast, up to 3GB, and then it crashed. No access to the influxdb prompt. Then, as recommended, I switched to the 64-bit kernel. The memory it uses is still a lot, 3.3 GB, but it is stable and I have access to the prompt. Everything looks like it's working as before.

CJKohler commented 4 years ago

I'm having the same issue, where after a while it gets into a loop, trying to compact the tsm files, and fails with "cannot allocate memory". I have a Raspberry Pi 4 with 4GB of RAM and a fast SSD as the root partition. I'm running the 64-bit kernel: Linux cjkohler-pi20 5.4.35-v8+ #1314 SMP PREEMPT Fri May 1 17:54:25 BST 2020 aarch64 GNU/Linux. I'm doing some load testing, writing 5000 points/sec to a new database. It works for a couple of hours, but then it reports the "cannot allocate memory" error, and I see it every minute in the log after that. About 2 hours later the posting of new values stops with an HTTP connection refused. The problem is that after that, it never recovers.

free -h reports 3.2 GB available. The .tsm.tmp files grow to about 250 MB, then go back to 0 and grow again, never finishing. I see this message repeated over and over:

> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287484Z lvl=info msg="TSM compaction (start)" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group op_event=start
> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287553Z lvl=info msg="Beginning compaction" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group tsm1_files_n=4
> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287577Z lvl=info msg="Compacting file" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group tsm1_index=0 tsm1_file=/var/lib/influxdb/data/writetest/autogen/28/000000136-000000002.tsm
> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287602Z lvl=info msg="Compacting file" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group tsm1_index=1 tsm1_file=/var/lib/influxdb/data/writetest/autogen/28/000000144-000000002.tsm
> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287624Z lvl=info msg="Compacting file" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group tsm1_index=2 tsm1_file=/var/lib/influxdb/data/writetest/autogen/28/000000152-000000002.tsm
> May 06 10:20:25 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:20:25.287645Z lvl=info msg="Compacting file" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group tsm1_index=3 tsm1_file=/var/lib/influxdb/data/writetest/autogen/28/000000160-000000002.tsm
> May 06 10:21:12 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:21:12.243678Z lvl=info msg="Error replacing new TSM files" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group error="cannot allocate memory"
> May 06 10:21:13 cjkohler-pi20 influxd[2170]: ts=2020-05-06T17:21:13.244117Z lvl=info msg="TSM compaction (end)" log_id=0MbLidsW000 engine=tsm1 tsm1_level=2 tsm1_strategy=level trace_id=0MbktSml000 op_name=tsm1_compact_group op_event=end op_elapsed=47956.602ms

I don't need 5000 points/sec; I'm just trying to see how fast I can send points. But once it gets into this compaction failure loop, I have to drop the database to get the CPU load down.

Any suggestions?

jjakob commented 4 years ago

Read my comments above: copy the DB files to a different machine, install Influx on it, start it, and watch the logs for compaction while observing memory usage with top. Once done, stop it, copy the files back to the origin machine, and start it back up. Then find a maximum number of points/sec that doesn't use all your RAM when compacting.

CJKohler commented 4 years ago

Thanks @jjakob, I did see that comment. At this point I'm fine with dropping the database and starting again, so I don't think I need to move it to a different machine.

I don't understand why it is not able to compact. When I run the "free -h" command I see 3.2 GB available. The file it is trying to compact is 225 MB. Here is my memory usage over time (graph attached): I started the test around 10 pm, and the logs show successful compaction of the files until about 2:45 am, at which point the posting of new data failed. I know from https://www.linuxatemyram.com/ that it is deceptive to only look at free memory because of caching.

I'm trying to see what gives me a good indication that a machine is keeping up. I looked at the .wal files, and those were not accumulating, suggesting that the database was able to keep up.

aemondis commented 4 years ago

@CJKohler with larger databases it's inevitable that you will eventually run into issues with InfluxDB on the RPI, even on the 4GB Pi 4 with the 64-bit kernel (I have this same configuration). I currently have the same compaction failures, and there's little you can do to resolve the situation, as it is simply due to how InfluxDB manages memory (very poorly).

In essence, the developers have no interest in supporting the IoT market with InfluxDB on lightweight hardware - the platform simply assumes an infinite amount of memory and CPU is available and offers no customisable limits on how resources are used. You can try messing around with retention periods and other variables, but I've been trying that for months and still have to routinely transfer the database to a desktop PC simply to allow a compaction, otherwise the DB eventually crashes due to its size. My use case requires fine-grained data logging, which unfortunately means a lot of data in each file, so I'm likely to move over to an Intel NUC or equivalent to handle InfluxDB.

It's a real pity the developers cannot see that configurations on lightweight hardware like this are inevitably going to be the future, especially with distributed IoT and sensor networks.

CJKohler commented 4 years ago

Thanks @ITGuyDave for confirming that this is an inherent problem and not something I can fix. I think I'll look into TimescaleDB combined with Grafana.

I created a 32 GB swap file using dphys-swapfile, which might be slow but should prevent out-of-memory errors; it doesn't help, though. It gets used a little, on the order of 100 MB, but doesn't prevent the compaction problem.

I tried index-version = "tsi1", which is supposed to use disk-based indices as opposed to in-memory ones, but no luck there.
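One caveat I've seen mentioned, in case it helps anyone else trying tsi1: the setting only applies to newly created shards, so existing shards have to be rebuilt with influx_inspect buildtsi while influxd is stopped. Even then, TSI only moves the series index to disk; the TSM data files are still mmapped, so on a 32-bit influxd build it doesn't remove the address-space ceiling. A rough sketch, assuming a default Debian-style install:

# Sketch (InfluxDB 1.x): rebuild existing shards as TSI after setting
# index-version = "tsi1" under [data] in /etc/influxdb/influxdb.conf
sudo service influxdb stop
sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal
sudo service influxdb start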

aemondis commented 4 years ago

@CJKohler I also tried TimescaleDB, and it does indeed work, but the RPI is a little too underpowered to make queries performant, as Postgres is a relational DB and thus requires very careful queries with appropriate indexing (you need to be careful how you write data into TimescaleDB as a result, as this impacts placement in the indexes and partitions). I also use Grafana, so it sounds like you have a similar configuration to me, although I run Grafana on a separate RPi 3 to allow the RPi4 to dedicate itself to just InfluxDB.

The biggest problem with the 64-bit kernel with RPi is that it's only a 64-bit kernel - but all the userspace is still 32-bit. I suspect this may be part of the issue with compaction.

There is one potential solution I haven't yet had the time to try... running balenaOS on the RPi 4, as they have a native 64-bit image that is not produced by the RPi foundation. As balenaOS is a very lightweight container host and can use the full memory of the host, you could then try the InfluxDB and Grafana Docker containers and have full native 64-bit support. Just be sure to use suitable ARM64 architecture images.

If you have the time to give it a try, let us know if this turns out to be a solution! :)

CJKohler commented 4 years ago

Thanks @ITGuyDave BalenaOS sounds like an interesting approach. I have run docker in the past on my RPi3. Getting the SSD to be used in BalenaOS is another tricky issue. From a quick read it sounds like you can mount the SSD to a container, but not the host OS. And only one container can mount it. You might be able to share it with other containers through SMB, but that seems a bit convoluted. I'll see if I can give it a try.

CJKohler commented 4 years ago

After trying balenaOS for about a week, I gave up. It does provide true 64-bit support on a Pi 4, but trying to use an SSD to store the InfluxDB data was just too painful.

But the 64-bit unofficial Ubuntu 18.04 build for Raspberry Pi from James Chambers works great. And you only have to make one change in cmdline.txt to run it from an SSD. It can use the full 4 GB on a Pi 4. https://jamesachambers.com/raspberry-pi-4-xubuntu-18-04-image-released-unofficial/

I have been running my torture test that sends 5000 points per second to the database with a very high cardinality (144,000), and it is working great. I have sent well over 1 billion points, and the database is now 33 GB; it has compacted over 650 times without problems.

aemondis commented 4 years ago

@CJKohler that's some amazing work there! Balena wouldn't allow USB SSD storage though? That seems like a major limitation on any RPi...

Either way, I wasn't aware of Ubuntu having a workable build for the RPi now - I've always preferred Ubuntu for my Linux builds, so I might just have to jam in the time to make it happen! My InfluxDB is unable to compact yet again, so it's just a matter of time before my hand is forced into rebuilding the RPi with some other approach. On my current setup, I struggle to even get 300 points per second into it from my sensor network, mainly due to issues with Telegraf accepting JSON-formatted messages and hitting internal buffer limits when writing into InfluxDB. Time for a change, I think...

Thanks for the ongoing feedback; once I have a chance I'll try the Ubuntu build as well and see if it solves the InfluxDB-on-RPi issues for me too (and thus helps anyone else who comes across this thread).

aemondis commented 4 years ago

After trying the unofficial Ubuntu release... every issue I had with InfluxDB seems to have gone. Despite not actually changing any settings, the memory usage is minimal, the CPU load is nil under standard load, and the upper limit of pps being sent to the DB is extremely high (I'm seeing about 4300 pps with an ancient Intel 74 GB MLC SSD). I have a 53GB database running currently, and no issues to speak of with compacting any more. I hadn't intended on doing this test today, but shortly after I posted the earlier comment, InfluxDB went into the dreaded endless service restart loop due to the failing compactions.

For anyone else who comes across this thread and wants a miniature InfluxDB server that can actually handle a moderately sized DB... you won't get a reliable InfluxDB without a fully 64-bit environment, as InfluxDB does not support 32-bit very well. Just do away with Raspbian and Balena and go to the Ubuntu server image mentioned by @CJKohler. It seems much more responsive, is running significantly faster, and the RPi4 is running almost cool to the touch for the first time, with InfluxDB running faster than it ever has!

JanHBade commented 4 years ago

64-bit Raspberry Pi OS is coming: https://www.raspberrypi.org/forums/viewtopic.php?f=117&t=275370

I ordered an 8GB Pi and will test the setup...

unreal4u commented 3 years ago

After trying the unofficial Ubuntu release... every issue I had with InfluxDB seems to have gone. Despite not actually changing any settings, the memory usage is minimal, the CPU load is nil under standard load, and the upper limit of pps being sent to the DB is extremely high (I'm seeing about 4300 pps with an ancient Intel 74 GB MLC SSD). I have a 53GB database running currently, and no issues to speak of with compacting any more. I hadn't intended on doing this test today, but shortly after I posted the earlier comment, InfluxDB went into the dreaded endless service restart loop due to the failing compactions.

For anyone else who comes across this thread and wants a miniature InfluxDB server that can actually handle a moderately sized DB... you won't get a reliable InfluxDB without a fully 64-bit environment, as InfluxDB does not support 32-bit very well. Just do away with Raspbian and Balena and go to the Ubuntu server image mentioned by @CJKohler. It seems much more responsive, is running significantly faster, and the RPi4 is running almost cool to the touch for the first time, with InfluxDB running faster than it ever has!

Hi @ITGuyDave! I'm seeing the exact same problem here with my Pi 4 4GB, plus some other issues as well (Xorg crashing randomly, and needing to restart networking after each boot because the Pi loses connectivity; more info here: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=277231 ).

Can you confirm that the GPIO ports are working with Ubuntu?

Greetings.

aemondis commented 3 years ago

@unreal4u I personally don't use the RPi for GPIO, as I predominantly use mine as mini low-power servers and have Arduino-based MCUs sending data to them over the network, but according to the maintainer of the Ubuntu image (see https://jamesachambers.com/raspberry-pi-4-ubuntu-server-desktop-18-04-3-image-unofficial/), the standard Raspbian kernel and utilities are available, so I see no reason they shouldn't work.

For InfluxDB also... be sure to set the index to file-based rather than in-memory, since a growing DB will inevitably run into memory-caching issues at some point once there is enough data. You might also need to play around a bit with the retention period and shard groups to better tune how the database manages the underlying shards. I played around a lot with mine over countless hours (I'm still not happy with the performance, but it's 90% better immediately simply by going to Ubuntu). My server is now logging about 47 parameters every 5 seconds into InfluxDB with no issues, via Mosquitto MQTT into Telegraf, and the CPU is almost idle with very low memory usage. Just be sure you have a decent MLC-based SSD attached to get the most out of it - avoid the SD card wherever you can, and move all frequently-accessed log files to the SSD rather than the SD to avoid the card dying from excessive writes. I run an ancient Intel X25-M 74GB SSD for my InfluxDB via USB3 and it runs brilliantly, considering the very low power needs of that setup.

Let us know your experience with GPIO?

unreal4u commented 3 years ago

Thanks @ITGuyDave !

I installed Ubuntu Server during the weekend and finally came around last night to play around with the GPIO ports. And yes, I can confirm they do work without problems!

I haven't played a lot with InfluxDB yet, but the compaction process did work without issues, and the avg. load has come down from a permanent 2.x to <1.0 (not bad considering I run 16+ docker images AND use the GUI as well to display a magic mirror, all while collecting data through USB ports + GPIO). The responsiveness of Grafana has increased considerably as well, mainly due to InfluxDB's faster response time. I do run InfluxDB through Docker, but I'll definitely take a look at tuning it; I don't import that much data, though (yet :) )

I was already using an SSD; the only quirk is that I had to go back to using a microsd card for /boot, and I had to limit the memory amount in order to let the RPi recognize the USB ports, but after that point it mounts the SSD and I can take advantage of the full 4GB of RAM. More info on that here: https://www.cnx-software.com/2019/11/04/raspberry-pi-4-4gb-models-usb-ports-dont-work-on-ubuntu-19-10/ (the post cites 19.10, but the same applies to 20.04 LTS, which is what I'm using).

All in all, I'm quite happy so far. The only thing I miss is vcgencmd, mainly because the command to turn the screen on and off through the CLI was super simple and I didn't have to fiddle with xrandr as much, but that is solved now as well :)

Thanks!

vogler commented 3 years ago

the only quirk is that I had to go back to using a microsd card for /boot

I don't know about Ubuntu, but on Raspbian this is no longer needed after some update. My RPi4 is running solely from SSD.

unreal4u commented 3 years ago

I don't know about Ubuntu, but on Raspbian this is no longer needed after some update. My RPi4 is running solely from SSD.

It does not seem possible yet. I had that same setup with Raspberry Pi OS, but my USB ports were not being recognized at boot, so it has to go through the SD card first. Not a big issue; I had the same setup before it was possible to boot directly from USB.