XRPLF / rippled

Decentralized cryptocurrency blockchain daemon implementing the XRP Ledger protocol in C++
https://xrpl.org
ISC License

Why does the rippled daemon use so much memory? #2374

passionofvc closed this issue 5 years ago

passionofvc commented 6 years ago

I have been running rippled on the testnet for two weeks. Everything works, but memory usage is very high. Is this normal behavior? I set online_delete=2000; there are no network errors, and RPC responses work fine.

OS: CentOS 7, memory: 8 GB, HDD: 150 GB

grep -v '#' cfg/stg/rippled.cfg
[peers_max]
50

[server]
port_rpc_admin_local
port_peer
port_ws_admin_local
port_ws_public

[port_rpc_admin_local]
port = 5005
ip = 0.0.0.0
admin = 127.0.0.1
protocol = http

[port_peer]
port = 51235
ip = 0.0.0.0
protocol = peer

[port_ws_admin_local]
port = 6006
ip = 0.0.0.0
admin = 127.0.0.1
protocol = ws

[port_ws_public]
port = 5006
ip = 172.16.0.171
protocol = ws
[node_size]
medium

[ledger_history]
2000

[node_db]
type=RocksDB
path=/home/ripple/build/db_stg/rocksdb
open_files=2000
filter_bits=12
cache_mb=256
file_size_mb=8
file_size_mult=2
online_delete=2000
advisory_delete=0

[database_path]
/home/ripple/build/db_stg

[debug_logfile]
/home/ripple/build/debug_stg.log

[sntp_servers]
time.windows.com
time.apple.com
time.nist.gov
pool.ntp.org

[ips]
s.altnet.rippletest.net 51235

[validators_file]
validators.txt

[rpc_startup]
{ "command": "log_level", "severity": "info" }

[ssl_verify]
1
[ripple@XRP-RIPPLE-DEV build]$  /home/ripple/build/rippled  --conf=/home/ripple/build/cfg/stg/rippled.cfg get_counts
Loading: "/home/ripple/build/cfg/stg/rippled.cfg"
2018-Feb-07 11:31:48 HTTPClient:NFO Connecting to 127.0.0.1:5005

{
   "result" : {
      "AL_hit_rate" : 49.51347351074219,
      "HashRouterEntry" : 25995,
      "Ledger" : 67,
      "NodeObject" : 951,
      "RCLCxPeerPos::Data" : 90,
      "SLE_hit_rate" : 0.4926642255044113,
      "STArray" : 2977,
      "STLedgerEntry" : 104,
      "STObject" : 36371,
      "STTx" : 31158,
      "STValidation" : 1782,
      "Transaction" : 31102,
      "dbKBLedger" : 16816,
      "dbKBTotal" : 43196,
      "dbKBTransaction" : 25224,
      "fullbelow_size" : 0,
      "historical_perminute" : 0,
      "ledger_hit_rate" : 80.02935028076172,
      "node_hit_rate" : 25.43169593811035,
      "node_read_bytes" : 650805671,
      "node_reads_hit" : 63522846,
      "node_reads_total" : 105161147,
      "node_writes" : 23754234,
      "node_written_bytes" : 3623700897,
      "status" : "success",
      "treenode_cache_size" : 736,
      "treenode_track_size" : 71322,
      "uptime" : "16 days, 1 hour, 7 minutes, 40 seconds",
      "write_load" : 0
   }
}
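
As a rough sanity check on the counters above, the following Python sketch derives a few ratios from the `get_counts` result. The reading of the fields is an assumption (that `node_reads_hit` and `node_reads_total` count node-store reads, and that the `dbKB*` values are database sizes in KB), and `summarize_counts` is a hypothetical helper name, not part of rippled.

```python
import json

# Subset of the get_counts result shown above, copied verbatim.
GET_COUNTS = json.loads("""
{
  "result": {
    "dbKBLedger": 16816,
    "dbKBTotal": 43196,
    "dbKBTransaction": 25224,
    "node_reads_hit": 63522846,
    "node_reads_total": 105161147,
    "node_writes": 23754234,
    "node_written_bytes": 3623700897
  }
}
""")


def summarize_counts(reply):
    """Derive simple ratios from a get_counts reply (hypothetical helper)."""
    r = reply["result"]
    return {
        # Share of node reads counted as hits (assumed meaning of the fields).
        "read_hit_ratio": r["node_reads_hit"] / r["node_reads_total"],
        # Average number of bytes per node write.
        "avg_write_bytes": r["node_written_bytes"] / r["node_writes"],
        # Assumed unit is KB; converted to MB here.
        "db_total_mb": r["dbKBTotal"] / 1024,
    }


if __name__ == "__main__":
    for name, value in summarize_counts(GET_COUNTS).items():
        print(f"{name}: {value:.2f}")
```

If the unit assumption holds, the databases total roughly 42 MB, so they would not by themselves account for several gigabytes of resident memory.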

[ripple@XRP-RIPPLE-DEV build]$  /home/ripple/build/rippled  --conf=/home/ripple/build/cfg/stg/rippled.cfg server_info
Loading: "/home/ripple/build/cfg/stg/rippled.cfg"
2018-Feb-07 11:32:47 HTTPClient:NFO Connecting to 127.0.0.1:5005

{
   "result" : {
      "info" : {
         "build_version" : "0.90.0-b4",
         "complete_ledgers" : "6507718-6509825",
         "hostid" : "XRP-RIPPLE-DEV",
         "io_latency_ms" : 1,
         "jq_trans_overflow" : "1276",
         "last_close" : {
            "converge_time_s" : 2,
            "proposers" : 9
         },
         "load" : {
            "job_types" : [
               {
                  "job_type" : "untrustedValidation",
                  "per_second" : 2
               },
               {
                  "job_type" : "untrustedProposal",
                  "per_second" : 25
               },
               {
                  "in_progress" : 1,
                  "job_type" : "clientCommand",
                  "per_second" : 1
               },
               {
                  "job_type" : "transaction",
                  "per_second" : 33
               },
               {
                  "job_type" : "batch",
                  "per_second" : 17
               },
               {
                  "job_type" : "fetchTxnData",
                  "per_second" : 1
               },
               {
                  "job_type" : "trustedValidation",
                  "per_second" : 1
               },
               {
                  "job_type" : "writeObjects",
                  "peak_time" : 4,
                  "per_second" : 2
               },
               {
                  "avg_time" : 3,
                  "job_type" : "acceptLedger",
                  "peak_time" : 7
               },
               {
                  "job_type" : "trustedProposal",
                  "per_second" : 2
               },
               {
                  "job_type" : "peerCommand",
                  "per_second" : 404
               },
               {
                  "job_type" : "diskAccess",
                  "peak_time" : 4,
                  "per_second" : 1
               },
               {
                  "job_type" : "processTransaction",
                  "per_second" : 33
               },
               {
                  "job_type" : "WriteNode",
                  "per_second" : 8
               }
            ],
            "threads" : 6
         },
         "load_factor" : 1,
         "peer_disconnects" : "1362",
         "peer_disconnects_resources" : "0",
         "peers" : 10,
         "pubkey_node" : "n9Mvv8xpTnbynEhhkFZaUr3QZRvfptXyLhrTMnnQcMBKaxM4ckBh",
         "pubkey_validator" : "none",
         "server_state" : "full",
         "state_accounting" : {
            "connected" : {
               "duration_us" : "35258017813",
               "transitions" : 394
            },
            "disconnected" : {
               "duration_us" : "1567698",
               "transitions" : 1
            },
            "full" : {
               "duration_us" : "1346226905976",
               "transitions" : 325
            },
            "syncing" : {
               "duration_us" : "2927279571",
               "transitions" : 367
            },
            "tracking" : {
               "duration_us" : "2726456005",
               "transitions" : 699
            }
         },
         "uptime" : 1386518,
         "validated_ledger" : {
            "age" : 6,
            "base_fee_xrp" : 1e-05,
            "hash" : "6F337B017FF4661944CF902713A39791AA1BE2F51E761437BF01CB4A31B7F0EB",
            "reserve_base_xrp" : 20,
            "reserve_inc_xrp" : 5,
            "seq" : 6509825
         },
         "validation_quorum" : 7,
         "validator_list_expires" : "2018-Feb-14 00:00:00"
      },
      "status" : "success"
   }
}
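
To judge how stable the server has been, the `state_accounting` durations above can be turned into shares of total time. This is a minimal Python sketch; it assumes `duration_us` is in microseconds, as the suffix suggests, and `state_shares` is a hypothetical helper name.

```python
# state_accounting values copied from the server_info output above.
STATE_ACCOUNTING = {
    "connected": {"duration_us": "35258017813", "transitions": 394},
    "disconnected": {"duration_us": "1567698", "transitions": 1},
    "full": {"duration_us": "1346226905976", "transitions": 325},
    "syncing": {"duration_us": "2927279571", "transitions": 367},
    "tracking": {"duration_us": "2726456005", "transitions": 699},
}


def state_shares(accounting):
    """Return each state's fraction of the total accounted time."""
    # duration_us is reported as a string, so convert before summing.
    durations = {s: int(v["duration_us"]) for s, v in accounting.items()}
    total = sum(durations.values())
    return {s: d / total for s, d in durations.items()}


if __name__ == "__main__":
    shares = state_shares(STATE_ACCOUNTING)
    # Print states from largest to smallest share.
    for state, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{state:>12}: {share:6.2%}")
```

For the output above, about 97% of the accounted time falls in the `full` state, although the transition counts suggest the server left and re-entered that state many times.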

[ripple@XRP-RIPPLE-DEV build]$ iostat -xNm 3
Linux 3.10.0-693.el7.x86_64 (XRP-RIPPLE-DEV)    2018-02-07  _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.58    0.00    0.21    0.82    0.00   98.39

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               4.28     5.00   11.44    4.39     0.09     0.24    43.92     0.30   18.87    9.98   42.03   1.58   2.50
scd0              0.00     0.00    0.00    0.00     0.00     0.00   114.22     0.00    1.56    1.56    0.00   1.22   0.00
centos-root       0.00     0.00    0.21    0.15     0.01     0.00    80.57     0.02   66.58   60.28   75.36   4.27   0.16
centos-swap       0.00     0.00    5.21    5.18     0.02     0.02     8.00     0.34   31.74   16.03   47.53   0.44   0.46
centos-home       0.00     0.00   10.31    4.13     0.06     0.22    40.39     0.26   17.88    8.51   41.28   1.55   2.24

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.63    0.00    0.17    0.00    0.00   99.21

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
centos-root       0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
centos-swap       0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
centos-home       0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
nload
Incoming:
                                    Curr: 1014.02 kBit/s
                                    Avg: 875.29 kBit/s
                         .          Min: 210.74 kBit/s
              .. .|.|#. .##.. .|.|  Max: 1.40 MBit/s
           ..##########|##########  Ttl: 381.31 GByte
Outgoing:

                                    Curr: 448.59 kBit/s
                                    Avg: 290.80 kBit/s
                                    Min: 63.24 kBit/s
                                    Max: 570.10 kBit/s
            ...#.....|. ..|..   .|  Ttl: 73.47 GByte

[Screenshot: htop output on the rippled host]

MarkusTeufelberger commented 6 years ago

Where do you see the high memory usage, or rather... what do you consider "high"?

passionofvc commented 6 years ago

Hi @MarkusTeufelberger, the htop screenshot at the end shows 7.24G/7.64G in use. The CentOS server has 8 GB of memory and runs only the rippled daemon.

MarkusTeufelberger commented 6 years ago

Set the node_size to tiny or use more RAM.

gituser commented 6 years ago

Yes, rippled is very hungry for memory. Even if you set node_size = tiny, it uses a lot of memory.

It is also very heavy on I/O, so be prepared for your SSDs/HDDs to be constantly busy.

Here is what I got with node_size = tiny (the VM has 16 GB):

20828 ripple    20   0 9773.8m 8.579g 697320 S  15.7 53.6  18:42.80 rippled: main 
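
To compare readings like the `top` line above across machines, the memory columns can be normalized to bytes. The Python sketch below assumes the default `top` column order (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND) and that unsuffixed values are KiB while `m` and `g` suffixes mean MiB and GiB; `parse_top_line` and `to_bytes` are hypothetical helpers.

```python
TOP_LINE = (
    "20828 ripple    20   0 9773.8m 8.579g 697320 S  15.7 53.6  18:42.80 rippled: main"
)

# Multipliers for top's memory suffixes (no suffix is assumed to mean KiB).
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def to_bytes(field):
    """Convert a top memory field such as '8.579g' or '697320' to bytes."""
    suffix = field[-1].lower()
    if suffix in UNITS:
        return float(field[:-1]) * UNITS[suffix]
    return float(field) * 1024


def parse_top_line(line):
    """Extract memory figures from one top process line (assumed layout)."""
    cols = line.split()
    return {
        "pid": int(cols[0]),
        "virt_bytes": to_bytes(cols[4]),
        "res_bytes": to_bytes(cols[5]),
        "shr_bytes": to_bytes(cols[6]),
        "mem_percent": float(cols[9]),
    }


if __name__ == "__main__":
    info = parse_top_line(TOP_LINE)
    res_gib = info["res_bytes"] / 1024**3
    # Total RAM implied by the RES and %MEM columns.
    total_gib = res_gib / (info["mem_percent"] / 100)
    print(f"RES = {res_gib:.2f} GiB, implied total = {total_gib:.1f} GiB")
```

Under these assumptions, the resident size and the %MEM column are consistent with the 16 GB VM mentioned above.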
gituser commented 6 years ago

The new rippled v0.90.0 stopped working reliably with the RocksDB engine (constant 100% CPU load and I/O), using the same settings that worked with v0.81.0.

I've switched to the NuDB database engine for now, but iotop and top show that rippled is constantly using all available I/O there. Is there anything that can be done about that?

gituser commented 6 years ago

It seems the new version doesn't work at all on HDDs.

After running for 24 hours it got stuck with:

{
   "result" : {
      "error" : "noNetwork",
      "error_code" : 17,
      "error_message" : "InsufficientNetworkMode",
      "request" : {
         "account" : "-",
         "command" : "account_info",
         "ledger_index" : "validated",
         "method" : "account_info"
      },
      "status" : "error"
   }
}

rippled runs at a constant 400% CPU load (all cores) and uses 16 GB of memory.

The node size is tiny and the node uses NuDB, as you recommended.

[node_size]
#
#   Tunes the servers based on the expected load and available memory. Legal
#   sizes are "tiny", "small", "medium", "large", and "huge". We recommend
#   you start at the default and raise the setting if you have extra memory.
#   The default is "tiny".
tiny

[ledger_history]
256

[node_db]
type=NuDB
path=/home/ripple/.ripple/db/nudb
online_delete=2000
advisory_delete=0

For now I've downgraded to 0.81.0 with RocksDB, which works just fine.

Could you look into it @wilsonianb ?

gituser commented 6 years ago

There is also a similar report from another user: https://groups.google.com/forum/#!topic/ripple-server/4O3hk-OnMI0

For me, v0.90.0 is not usable at all on HDDs.

MarkusTeufelberger commented 6 years ago

NuDB can't be used on HDDs.

gituser commented 6 years ago

@MarkusTeufelberger As I said before, the new version v0.90.0 doesn't work properly with RocksDB for some reason: constant load and high memory usage.

Is it possible that the update broke something related to the RocksDB engine?

I've checked debug.log, and there are lots of lines like:

2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=5A944118AF413EEAE093C3BE15FE9657D144F4D0203A7AD2EEFAF8CB11A2AC33 seq=36855529
2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=5A944118AF413EEAE093C3BE15FE9657D144F4D0203A7AD2EEFAF8CB11A2AC33 seq=36855529
2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=80371445534F105BB12ED60CC5340C8BCB15F0DF89C36A5B0A93804044FCE657 seq=36860383
.....
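
To gauge how often each ledger triggers the warning above, the lines can be tallied per ledger sequence. This is a minimal Python sketch that assumes the log format matches the three sample lines exactly; `count_ancestor_warnings` is a hypothetical helper, and the sample data is just the excerpt quoted above.

```python
import re
from collections import Counter

# The three sample lines quoted above.
SAMPLE_LOG = """\
2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=5A944118AF413EEAE093C3BE15FE9657D144F4D0203A7AD2EEFAF8CB11A2AC33 seq=36855529
2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=5A944118AF413EEAE093C3BE15FE9657D144F4D0203A7AD2EEFAF8CB11A2AC33 seq=36855529
2018-Feb-27 02:59:33 Validations:WRN Unable to determine hash of ancestor seq=1 from ledger hash=80371445534F105BB12ED60CC5340C8BCB15F0DF89C36A5B0A93804044FCE657 seq=36860383
"""

# Pattern derived from the sample lines; other log formats are not handled.
WARNING_RE = re.compile(
    r"Validations:WRN Unable to determine hash of ancestor seq=(?P<ancestor>\d+) "
    r"from ledger hash=(?P<hash>[0-9A-F]+) seq=(?P<seq>\d+)"
)


def count_ancestor_warnings(lines):
    """Count matching warnings per ledger sequence number."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[int(match.group("seq"))] += 1
    return counts


if __name__ == "__main__":
    for seq, n in count_ancestor_warnings(SAMPLE_LOG.splitlines()).most_common():
        print(f"ledger seq {seq}: {n} warning(s)")
```

On a real server, the same function could be fed the lines of debug.log instead of the sample string.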
bazzilio commented 5 years ago

On version 1.0.1, rippled still uses more than 10 GB of memory.

nbougalis commented 5 years ago

There are several improvements that are planned to reduce the amount of memory used. Stay tuned under "Issues" for a new issue.