Can you please show us the output of the `server_info` command (ref)?
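(For reference, `server_info` can be run from the CLI as `rippled server_info`, or fetched over the admin JSON-RPC API. A minimal sketch, assuming the default admin port 5005 on localhost:)

```python
# Minimal sketch: fetch server_info from a local rippled node over JSON-RPC.
# Assumes the default admin port 5005 on localhost; adjust host/port as needed.
import requests  # third-party: pip install requests

resp = requests.post("http://127.0.0.1:5005",
                     json={"method": "server_info", "params": [{}]})
info = resp.json()["result"]["info"]
print(info["server_state"], info["complete_ledgers"])
```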
I have restarted, and things seem to be normal now...
I only remember that when this problem happened, `server_info` showed the `server_state` as `connected` and `complete_ledgers` at `14160000-16090545`, while the actual latest ledger was around 16150000.
I also found the same `New quorum of 18446744073709551615 exceeds the number of trusted validators` problem on my mainnet node, where `complete_ledgers` shows that some ledgers are missing.
I think there is a consensus problem or ledgers are being lost, but I cannot reproduce it right now.
> Can you please show us the output of the `server_info` command (ref)?
I think my explanation has confused you.
I have hit two errors when running the `rippled` server: one happens on the testnet and the other on the mainnet. Both result in missing ledgers, so my application, which depends on a contiguous ledger sequence, cannot work normally. I think both my testnet and mainnet `rippled` servers are configured correctly; the problem is in the ledger processing during consensus or persistence.
I checked the `debug.log` of both of them, and they both contain an error like this:
`New quorum of 18446744073709551615 exceeds the number of trusted validators`
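(Note: 18446744073709551615 is 2^64 - 1, i.e. UINT64_MAX, the value you get when a 64-bit unsigned computation wraps below zero, so the reported quorum suggests an unsigned underflow somewhere rather than a genuine quorum of that size. A minimal illustration of the wrap-around; this is not rippled's actual quorum code:)

```python
# 18446744073709551615 == 2**64 - 1 (UINT64_MAX). A 64-bit unsigned value
# that dips below zero wraps around to exactly this number.
# Illustration only; not rippled's actual quorum computation.
import ctypes

wrapped = ctypes.c_uint64(0 - 1)   # 0 - 1 wraps to UINT64_MAX in uint64
print(wrapped.value)               # 18446744073709551615
assert wrapped.value == 2**64 - 1
```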
On the testnet, the `rippled` server can only receive new ledgers from peers but cannot reach consensus on them or persist them to the database. The `server_info` command shows that the state is `connected` and `complete_ledgers` is at the range `14160000-16090545`, which is far behind the latest ledger sequence of about 16150000.
On the mainnet, the `rippled` server seems to keep running in the `full` state, but `complete_ledgers` looks like `43185555-43301323, 43301327-4340****`: there is more than one interval, and between the intervals some ledgers are missing.
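(A small sketch of how one might detect such gaps from the `complete_ledgers` string; `find_gaps` is a hypothetical helper of my own, not part of rippled, and the second range below uses illustrative numbers since the original value is truncated:)

```python
# Sketch: parse a complete_ledgers string such as
# "43185555-43301323,43301327-43400000" and report the missing ledgers
# between intervals. Hypothetical helper, not part of rippled.
def find_gaps(complete_ledgers: str):
    """Return (gap_start, gap_end) pairs for ledgers missing between ranges."""
    if complete_ledgers == "empty":
        return []
    ranges = []
    for part in complete_ledgers.split(","):
        lo, _, hi = part.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    ranges.sort()
    gaps = []
    for (_, prev_hi), (next_lo, _) in zip(ranges, ranges[1:]):
        if next_lo > prev_hi + 1:
            gaps.append((prev_hi + 1, next_lo - 1))
    return gaps

# Illustrative numbers (the real upper bound was truncated in the report):
print(find_gaps("43185555-43301323,43301327-43400000"))  # [(43301324, 43301326)]
```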
Now I have restarted both the testnet and mainnet servers.
On the testnet, the `rippled` server recovered immediately, and `complete_ledgers` restarted from almost the latest ledger sequence; it is now at `16166627-16377448`.
But unfortunately, on the mainnet it has taken almost two days since the restart, and it is still at `connected`. The detailed `server_info` output is below:
{
   "result" : {
      "info" : {
         "build_version" : "1.1.1",
         "closed_ledger" : {
            "age" : 5,
            "base_fee_xrp" : 1e-05,
            "hash" : "9B651C8AB97DA84D38C942E25F080B9258BCECF2675F07FCE3A0B97012C84525",
            "reserve_base_xrp" : 200,
            "reserve_inc_xrp" : 50,
            "seq" : 17635
         },
         "complete_ledgers" : "empty",
         "hostid" : "77b7488105af",
         "io_latency_ms" : 1,
         "jq_trans_overflow" : "0",
         "last_close" : {
            "converge_time_s" : 5.007,
            "proposers" : 26
         },
         "load" : {
            "job_types" : [
               {
                  "job_type" : "untrustedProposal",
                  "peak_time" : 10,
                  "per_second" : 46
               },
               {
                  "in_progress" : 2,
                  "job_type" : "ledgerData",
                  "waiting" : 65
               },
               {
                  "in_progress" : 1,
                  "job_type" : "clientCommand"
               },
               {
                  "job_type" : "transaction",
                  "peak_time" : 5,
                  "per_second" : 15
               },
               {
                  "job_type" : "batch",
                  "per_second" : 6
               },
               {
                  "job_type" : "advanceLedger",
                  "peak_time" : 12,
                  "per_second" : 11
               },
               {
                  "job_type" : "fetchTxnData",
                  "peak_time" : 2,
                  "per_second" : 8
               },
               {
                  "job_type" : "trustedValidation",
                  "peak_time" : 13,
                  "per_second" : 4
               },
               {
                  "job_type" : "writeObjects",
                  "peak_time" : 6,
                  "per_second" : 4
               },
               {
                  "job_type" : "trustedProposal",
                  "peak_time" : 2,
                  "per_second" : 11
               },
               {
                  "avg_time" : 1,
                  "job_type" : "heartbeat",
                  "peak_time" : 2
               },
               {
                  "job_type" : "peerCommand",
                  "peak_time" : 1,
                  "per_second" : 693
               },
               {
                  "job_type" : "diskAccess",
                  "peak_time" : 5,
                  "per_second" : 4
               },
               {
                  "job_type" : "processTransaction",
                  "per_second" : 7
               },
               {
                  "job_type" : "AsyncReadNode",
                  "peak_time" : 93,
                  "per_second" : 1851
               }
            ],
            "threads" : 4
         },
         "load_factor" : 1,
         "peer_disconnects" : "18",
         "peer_disconnects_resources" : "0",
         "peers" : 10,
         "pubkey_node" : "n9J5DucjxQqSJaRWFPJcP7FqTfW8jiiJoQgbQ7nCert2HUrSHwr3",
         "pubkey_validator" : "none",
         "published_ledger" : "none",
         "server_state" : "connected",
         "state_accounting" : {
            "connected" : {
               "duration_us" : "74699859850",
               "transitions" : 1
            },
            "disconnected" : {
               "duration_us" : "1312716",
               "transitions" : 1
            },
            "full" : {
               "duration_us" : "0",
               "transitions" : 0
            },
            "syncing" : {
               "duration_us" : "0",
               "transitions" : 0
            },
            "tracking" : {
               "duration_us" : "0",
               "transitions" : 0
            }
         },
         "time" : "2019-Jan-25 01:57:26.578305",
         "uptime" : 74701,
         "validation_quorum" : 21,
         "validator_list" : {
            "count" : 1,
            "expiration" : "2019-Jan-31 00:00:00.000000000",
            "status" : "active"
         }
      },
      "status" : "success"
   }
}
I think this is because the local database is dirty and the server cannot acquire the missing ledgers from peers; I can only recover it by deleting the local database manually. My db config is below:
[node_db]
type=RocksDB
path=/var/rippled/lib/rippled/db/rocksdb
open_files=2000
filter_bits=12
cache_mb=256
file_size_mb=8
file_size_mult=2
online_delete=200000
advisory_delete=1
[ledger_history]
150000
I want to keep roughly the most recent two weeks of ledgers to support my application. But these problems cause important ledger data to go missing and interrupt my application. Can you give me some help to solve this?
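(For context on the retention math: XRP Ledger ledgers close roughly every 4 seconds on average, an approximation that varies in practice, so two weeks corresponds to roughly 300,000 ledgers; the `online_delete=200000` above would therefore retain only about nine days. A back-of-the-envelope check:)

```python
# Rough retention check, assuming an average ledger close time of ~4 seconds
# (approximate; actual close times vary).
SECONDS_PER_LEDGER = 4
two_weeks = 14 * 24 * 3600 // SECONDS_PER_LEDGER      # ledgers in two weeks
retained = 200_000 * SECONDS_PER_LEDGER / 86_400      # days kept by online_delete
print(two_weeks, retained)                            # 302400 ledgers, ~9.3 days
```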
I suspect this was caused by a lack of peer connections on the test net. If you don't have enough peers that are on the same net as you (i.e. not "insane"), then it can be hard for your server to stay synced.
If you are behind a firewall and you don't open the peer protocol port (51235 by default), then you can only rely on outbound peers, who tend to be busier and may drop you as a peer. It's also possible you may end up connected to test net peers only when you want to be on the main net, or vice versa (look for `"insane"` in the `peers` response; a quick check is sketched below).
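A minimal sketch of such a check, assuming the default admin JSON-RPC port 5005 on localhost and the `inbound`/`sanity` fields present in rippled 1.x `peers` responses:

```python
# Sketch: list peers from a local rippled node and flag "insane" ones
# (peers that appear to be on a different network). Assumes the default
# admin port 5005 on localhost; field names as in rippled 1.x responses.
import requests  # third-party: pip install requests

resp = requests.post("http://127.0.0.1:5005",
                     json={"method": "peers", "params": [{}]})
peers = resp.json()["result"].get("peers", [])

inbound = [p for p in peers if p.get("inbound")]
insane = [p for p in peers if p.get("sanity") == "insane"]

print(f"{len(peers)} peers, {len(inbound)} inbound, {len(insane)} insane")
for p in insane:
    print("insane peer:", p.get("address"))
```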
So two actions you can take to reduce connectivity-related problems are:

1. Open the peer protocol port (51235 by default) through your firewall so that other servers can connect to you as inbound peers.
2. Confirm that you are actually accepting inbound connections, e.g. by looking for `"type": "in"` peers in the Peer Crawler response.

Anyway, this issue hasn't been updated in a while, so I'm closing it as stale, but feel free to reopen if you are still having problems.
Excuse me, I have met the same problem:
`New quorum of 18446744073709551615 exceeds the number of trusted validators`
The detailed info in `debug.log` is as follows:
You can see that before `2019-Jan-14 23:51:13` everything is OK, and my node can receive and persist closed ledgers normally. But at `2019-Jan-14 23:51:27` the `Peer:WRN [325098] onReadMessage: Connection reset by peer` message appears; then at `2019-Jan-15 00:00:00` it reports `New quorum of 18446744073709551615 exceeds the number of trusted validators`, and at `2019-Jan-15 00:00:23` my node is at `View of consensus changed during open status=open, mode=wrongLedger`. After that, my node can only receive closed ledgers but cannot sync them. My node runs on Ubuntu 18.04 on the testnet with rippled 1.1.1.
I have restarted the node and lost all the ledgers I had synced. Can you tell me how to avoid this problem?

Originally posted by @hippo-dalaoshe in https://github.com/ripple/rippled/issues/2611#issuecomment-456638545