storj-archived / core

Deprecated. Implementation of the Storj v2 protocol for Node.js.
https://storj.io

High i/o on farmer start with existing shards #136

Closed BlackDuck888 closed 8 years ago

BlackDuck888 commented 8 years ago

Package Versions

Output from npm list storj:

Farmer: v0.4.3
Core:   v0.6.16

Output from node --version:

v4.4.3

Expected Behavior

I expect the system to be responsive shortly after the farmer starts.

Actual Behavior

It looks like the full farmer.db is reread/rewritten after a farmer start/restart. On my system this process takes up to 20 minutes at ~44 GB of shards. For dedicated servers this problem is secondary, but for users who run Windows and have to use their computer for "normal" things it could be a problem. With 500 GB of shards on disk and a reboot caused/forced by Windows Update, the system would need about 3 hours to become usable again.

I attach the dstat.csv.txt; in this file you can see the start and the high i/o over time.

And a screenshot of the disk usage.

diskusage_state

duck@vlan2-farmer:/$ date;pm2 restart all
Thu 26 May 09:11:48 CEST 2016
[PM2] Applying action restartProcessId on app [all](ids: 0,1,10,20,30)
[PM2] [17EAA_1](0) ✓
[PM2] [17EAA_2](1) ✓
[PM2] [1AKRj_1](10) ✓
[PM2] [1Gtpb_1](20) ✓
[PM2] [1Mh32_1](30) ✓
+-----------------------------------------------------------------------------------+
│ App name │ id │ mode │ pid   │ status │ restart │ uptime │ memory      │ watching │
+----------+----+------+-------+--------+---------+--------+-------------+----------+
│ 17EAA_1  │ 0  │ fork │ 25863 │ online │ 1       │ 0s     │ 29.258 MB   │ disabled │
│ 17EAA_2  │ 1  │ fork │ 25868 │ online │ 1       │ 0s     │ 25.469 MB   │ disabled │
│ 1AKRj_1  │ 10 │ fork │ 25873 │ online │ 1       │ 0s     │ 28.008 MB   │ disabled │
│ 1Gtpb_1  │ 20 │ fork │ 25878 │ online │ 2       │ 0s     │ 33.805 MB   │ disabled │
│ 1Mh32_1  │ 30 │ fork │ 25899 │ online │ 1       │ 0s     │ 17.184 MB   │ disabled │
+-----------------------------------------------------------------------------------+

top - 09:13:31 up 19:42,  2 users,  load average: 4.34, 1.70, 0.98
Tasks: 550 total,   2 running, 548 sleeping,   0 stopped,   0 zombie
%Cpu(s): 27.4 us,  3.3 sy,  0.0 ni, 15.5 id, 53.7 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem :  8175356 total,   994004 free,  4668628 used,  2512724 buff/cache
KiB Swap:  1044476 total,  1016836 free,    27640 used.  3049396 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
25878 duck      20   0 2089024 153184  89536 S   1,0  1,9   0:02.53 node /usr/local
25863 duck      20   0 2854300 256532 176692 S   0,7  3,1   0:03.14 node /usr/local
25873 duck      20   0 2297024 190036 134892 S   0,7  2,3   0:02.52 node /usr/local

[pid 25909] open("/mnt/disk1/17EAA_1/farmer.db", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 18
[pid 25909] fstat(18, {st_mode=S_IFDIR|0755, st_size=7517, ...}) = 0
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32744
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32768
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32768
[pid 25909] getdents(18, /* 1023 entries */, 32768) = 32744
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32760
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32768
[pid 25909] getdents(18, /* 1024 entries */, 32768) = 32768
[pid 25909] getdents(18, /* 350 entries */, 32768) = 11200
[pid 25909] getdents(18, /* 0 entries */, 32768) = 0
[pid 25909] close(18)                   = 0
[pid 25909] munmap(0x7f7702ff5000, 330999) = 0
[pid 25909] write(12, "2016/05/26-09:15:19.393209 7f776"..., 62) = 62
[pid 25909] unlink("/mnt/disk1/17EAA_1/farmer.db/610262.ldb") = 0
[pid 25909] futex(0x31e0d64, FUTEX_WAIT_PRIVATE, 23, NULL <unfinished ...>
[pid 25890] open("/mnt/disk1/17EAA_1/farmer.db/569556.ldb", O_RDONLY) = 18
[pid 25890] stat("/mnt/disk1/17EAA_1/farmer.db/569556.ldb", {st_mode=S_IFREG|0664, st_size=2105873, ...}) = 0
[pid 25890] mmap(NULL, 2105873, PROT_READ, MAP_SHARED, 18, 0) = 0x7f7724de3000
[pid 25890] close(18)                   = 0
[pid 25890] open("/mnt/disk1/17EAA_1/farmer.db/569557.ldb", O_RDONLY) = 18
[pid 25890] stat("/mnt/disk1/17EAA_1/farmer.db/569557.ldb", {st_mode=S_IFREG|0664, st_size=2104675, ...}) = 0
[pid 25890] mmap(NULL, 2104675, PROT_READ, MAP_SHARED, 18, 0) = 0x7f773ddc7000
[pid 25890] close(18)                   = 0
[pid 25890] munmap(0x7f7728a33000, 2108476) = 0
[pid 25890] open("/mnt/disk1/17EAA_1/farmer.db/569558.ldb", O_RDONLY) = 18
[pid 25890] stat("/mnt/disk1/17EAA_1/farmer.db/569558.ldb", {st_mode=S_IFREG|0664, st_size=2104652, ...}) = 0
[pid 25890] mmap(NULL, 2104652, PROT_READ, MAP_SHARED, 18, 0) = 0x7f7728a34000
[pid 25890] close(18)                   = 0
[pid 25890] munmap(0x7f7719c90000, 2105555) = 0
[pid 25890] open("/mnt/disk1/17EAA_1/farmer.db/569559.ldb", O_RDONLY) = 18
[pid 25890] stat("/mnt/disk1/17EAA_1/farmer.db/569559.ldb", {st_mode=S_IFREG|0664, st_size=2104651, ...}) = 0
[pid 25890] mmap(NULL, 2104651, PROT_READ, MAP_SHARED, 18, 0) = 0x7f77263fe000
[pid 25890] close(18)                   = 0

duck@vlan2-farmer:~$ date;pm2 list
Thu 26 May 09:16:14 CEST 2016
+------------------------------------------------------------------------------------+
│ App name │ id │ mode │ pid   │ status │ restart │ uptime │ memory       │ watching │
+----------+----+------+-------+--------+---------+--------+--------------+----------+
│ 17EAA_1  │ 0  │ fork │ 25863 │ online │ 1       │ 4m     │ 329.820 MB   │ disabled │
│ 17EAA_2  │ 1  │ fork │ 25868 │ online │ 1       │ 4m     │ 71.723 MB    │ disabled │
│ 1AKRj_1  │ 10 │ fork │ 25873 │ online │ 1       │ 4m     │ 240.191 MB   │ disabled │
│ 1Gtpb_1  │ 20 │ fork │ 25878 │ online │ 2       │ 4m     │ 234.816 MB   │ disabled │
│ 1Mh32_1  │ 30 │ fork │ 25899 │ online │ 1       │ 4m     │ 283.184 MB   │ disabled │
+------------------------------------------------------------------------------------+

ncdu 1.11 ~ Use the arrow keys to navigate, press ? for help
--- /mnt/disk1 -------------------------------------------------------------
   14,2 GiB [##########] /17EAA_1
   14,2 GiB [######### ] /1Gtpb_1
   11,5 GiB [########  ] /1Mh32_1
    4,5 GiB [###       ] /1AKRj_1
   16,0 MiB [          ] /17EAA_2
    8,5 KiB [          ] /1AKRj_10
    8,5 KiB [          ] /1Gtpb_6
    8,5 KiB [          ] /1AKRj_8
    8,5 KiB [          ] /1AKRj_6
    8,5 KiB [          ] /1AKRj_3
    8,5 KiB [          ] /17EAA_10
    8,5 KiB [          ] /1Mh32_10
    8,5 KiB [          ] /1Gtpb_10
    8,5 KiB [          ] /1Mh32_9
 Total disk usage:  44,4 GiB  Apparent size:  44,5 GiB  Items: 24011

Steps to Reproduce

Please include the steps to reproduce the issue, numbered below. Include as much detail as possible.

  1. You need a node with some GB of shards.
  2. Restart the farmer app.
  3. Watch dstat/top and monitor the load and wait values (wait indicates that the system is waiting on disk i/o).
super3 commented 8 years ago

+1 For a super detailed issue.

BlackDuck888 commented 8 years ago

Thank you, Skunk is a nice coach in this.

tacticalchihuahua commented 8 years ago

This is the shard reaper that runs on interval to remove incomplete or expired contracts and their associated shards.

It likely needs to be run as a separate process and have some throttling mechanism in place.
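A throttled reaper could look something like the sketch below. This is purely illustrative and not the actual storj-lib implementation; the entry shape (`key`, `expires`), the batch/pause parameters, and the callback signature are all assumptions. The idea is simply to process contract entries in small batches and yield between batches so the reaper cannot saturate disk i/o on start.

```javascript
'use strict';

// Hypothetical sketch of a throttled shard reaper: walk contract entries
// in small batches, touching only metadata, and pause between batches so
// pending disk i/o can drain. Field names are assumptions for illustration.
function reapExpired(entries, opts, done) {
  const batchSize = opts.batchSize || 100;
  const pauseMs = opts.pauseMs || 50;
  const reaped = [];
  let index = 0;

  function processBatch() {
    const end = Math.min(index + batchSize, entries.length);
    for (; index < end; index++) {
      const entry = entries[index];
      // Only inspect metadata; never open the shard data itself.
      if (entry.expires < opts.now) {
        reaped.push(entry.key);
      }
    }
    if (index >= entries.length) {
      return done(null, reaped);
    }
    // Throttle: let the event loop and disk breathe before the next batch.
    setTimeout(processBatch, pauseMs);
  }

  processBatch();
}
```

Running this as a separate process (as suggested above) would additionally keep the reaper's i/o from blocking the farmer's own request handling.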

BlackDuck888 commented 8 years ago

I did not calculate it down to the smallest detail, but it looks like all shards, with their complete content, are reread at a restart. The reaper should read the metadata and remove incomplete or expired contracts, but a reread of the full content is too much. After my test I had only lost some MB; most of the GB were retained.

tacticalchihuahua commented 8 years ago

We must be reading through all keys, not just .info keys. Will issue a fix with 1.0.0.
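Restricting the scan to metadata keys might look like this hypothetical sketch; the `.info` suffix is the only detail taken from the comment above, and the key layout is otherwise an assumption:

```javascript
'use strict';

// Hypothetical sketch: given an iterable of database keys, yield only the
// contract metadata keys (suffix '.info'), so the reaper never touches the
// keys that hold shard payload data.
function* infoKeys(keys) {
  for (const key of keys) {
    if (key.endsWith('.info')) {
      yield key;
    }
  }
}
```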

tacticalchihuahua commented 8 years ago

Okay, the 1.0.0 PR now tries to mitigate this by doing two things:

  1. Do not open the shard stream (doh!), only the contract metadata (adds a peek method to the adapter, used in place of get).
  2. Increase the reaper interval from 1 hour to 3 hours; there is no need to reap that often, and we might even stand to increase the interval further.
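The peek-versus-get distinction can be sketched as follows. This is a hypothetical adapter, not the real 1.0.0 code; the in-memory store layout (`{ info, shard }`) is an assumption made for illustration:

```javascript
'use strict';

// Hypothetical storage adapter sketch: get() pulls the contract metadata
// AND the shard bytes, while peek() returns only the small metadata
// record, which is all the reaper needs to decide whether to reap.
function makeAdapter(store) {
  return {
    // Expensive: reads the full shard contents.
    get(hash) {
      const item = store[hash];
      return { info: item.info, shard: item.shard };
    },
    // Cheap: touches only the contract metadata, never the shard stream.
    peek(hash) {
      return { info: store[hash].info };
    }
  };
}
```

With ~44 GB of shards, replacing get with peek in the reaper pass turns gigabytes of reads into a scan of small metadata records, which is exactly the mitigation described in point 1.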