tmpfs: 4G but No space left on device

XioNoX commented 1 year ago

Hi,

We're running Routinator 0.11.3 with a 4G tmpfs for the RPKI cache.

Filesystem      Size  Used Avail Use% Mounted on
tmpfs           4.0G  2.3G  1.8G  56% /var/lib/routinator/repository

Unfortunately, our Routinator systemd service started alerting because it was stuck in a crash loop. The daemon may have been restarted once by our automation (e.g. to pick up library upgrades), and then the following kept happening:

Sep 11 08:27:55 rpki2002 routinator[1667211]: Fatal: failed to open file /var/lib/routinator/repository/rrdp/rrdp.ripe.net/21d6592469dbe79feb2922562764fd193170f173229298b9a4443ffb5c282000/tmp/rpki.ripe.net/repository/DEFAULT/ws5Z0a-DDJS6jqEc-P7G3ZwDuRc.cer: No space left on device (os error 28)
Sep 11 08:27:56 rpki2002 routinator[1667211]: Fatal error. Exiting.

I manually stopped the daemon, cleaned the tmpfs filesystem, then restarted Routinator.

Filesystem      Size  Used Avail Use% Mounted on
tmpfs           4.0G  2.2G  1.9G  55% /var/lib/routinator/repository

It seems to be stable since then.

Given the available space on the tmpfs, I'm a bit surprised by the No space left on device error. One hypothesis is that Routinator downloads new data before removing old data during initialization, briefly requiring double the space/memory.

partim commented 1 year ago

The issue very likely comes from running out of inodes, which unfortunately results in a rather misleading ‘No space left on device’ error. (We manually adjusted the log message in 0.12 to read ‘No space or inodes left on device’.)

Without having tested this, I think you also want to add -o nr_inodes=2M when mounting the tmpfs. We’ll add this to the incantation mentioned in the manual once we have tested it.
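
For illustration, a minimal sketch of what the suggested option could look like alongside the 4G size from the original report. The helper name `fstab_line` is hypothetical, and the option values are just the ones mentioned in this thread; the function only prints an fstab-style line and does not touch any mount.

```shell
# Hypothetical helper: print an /etc/fstab-style line for a tmpfs mount
# with an explicit size and inode limit. Nothing is mounted or written.
fstab_line() {
  mountpoint="$1"   # e.g. /var/lib/routinator/repository
  size="$2"         # e.g. 4G
  inodes="$3"       # e.g. 2M (tmpfs accepts k/m/g suffixes for nr_inodes)
  printf 'tmpfs %s tmpfs size=%s,nr_inodes=%s 0 0\n' "$mountpoint" "$size" "$inodes"
}

fstab_line /var/lib/routinator/repository 4G 2M

# Assumption: on a live system the same options could be applied by root with
# something like the following (not executed here):
#   mount -o remount,size=4G,nr_inodes=2M /var/lib/routinator/repository
```

Whether a remount is preferable to an unmount/mount cycle depends on the setup; the limit presumably cannot be lowered below the number of inodes already in use.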

lukastribus commented 1 year ago

(To check inode numbers use df -i)
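
A small sketch of how that check could be scripted, for example in monitoring. The function names `inode_use_pct` and `check_inodes` and the default threshold of 90 are assumptions for illustration, not part of Routinator or its manual. The parser expects the column layout that `df -i` prints in this thread (IUse% in the fifth column).

```shell
# Read `df -i` output on stdin and print the IUse% value of the first
# data row, without the trailing percent sign.
inode_use_pct() {
  awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Hypothetical check: warn when inode usage of a path reaches a limit.
check_inodes() {
  path="$1"
  limit="${2:-90}"   # assumed default threshold, in percent
  pct=$(df -iP "$path" | inode_use_pct)
  case "$pct" in
    ''|*[!0-9]*)
      # Some filesystems report "-" instead of a number.
      echo "inode usage unavailable for $path"
      return 0
      ;;
  esac
  if [ "$pct" -ge "$limit" ]; then
    echo "WARNING: $path inode usage at ${pct}%"
    return 1
  fi
  echo "OK: $path inode usage at ${pct}%"
}
```

It could be called as `check_inodes /var/lib/routinator/repository 80`, which would flag the situation before writes start failing with os error 28.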

AlexanderBand commented 1 year ago

I'm in the process of documenting this in #890. Still need to test.

XioNoX commented 1 year ago

FYI we bumped the number of inodes in our Puppet code as suggested and it did the right thing without disrupting Routinator 3000. Thanks!

$ df -i
Filesystem      Inodes  IUsed   IFree IUse% Mounted on
tmpfs          2097152 653093 1444059   32% /var/lib/routinator/repository
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           4.0G  2.3G  1.8G  56% /var/lib/routinator/repository

NLnetLabs / routinator

tmpfs: 4G but No space left on device #889