EOSIO / eos

An open source smart contract platform
https://developers.eos.io/manuals/eos
MIT License

eos 1.3.0 - new mainnet API nodes with mongo_db_plugin enabled take 3-4 weeks #5797

Closed - nelsonenzo closed this issue 5 years ago

nelsonenzo commented 6 years ago

I'm trying to run an API node off the mainnet and have upgraded to eos version 1.3.0 (1.2.5 was just as bad, but the latest is bad too). I was able to successfully bring up a node last week (at ~14M blocks) without the mongo_db_plugin enabled, but with the plugin it now looks like a major blocker and the node may never catch up.

To what scale (block directory size) was the mongo_db_plugin ever tested?
Should we be expecting these scaling problems right now?

I'm running two instances - one for mongo, one for nodeos. Please remember these are read-only API nodes, not block producers. Both have the following specs:

Using --hard-replay from a downloaded data directory (9/23 backup date), I started around 5pm yesterday. It was originally processing at about 1,000 blocks/second - which was a fine pace; it should have caught up by 2am - but as of this morning at 8am it is only at block 6M, and processing at about 1,000 blocks per 2 minutes. I suspect it will slow to about 3 minutes per 1,000 blocks around block 8 million, sometime around noon today. Taking the remaining ~8 million blocks at the current rate: (8,000,000 / 1,000) * 2 minutes = 16,000 minutes, or about 11.11 days.

That assumes the processing rate does not drop further, which seems unlikely. It also does not account for the growing block size.
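For anyone who wants to redo this back-of-the-envelope estimate with their own numbers, the arithmetic is just remaining blocks divided by observed throughput; a quick shell version of the calculation above:

remaining=8000000          # blocks left to replay (rough)
per_batch=1000             # blocks per log interval
minutes_per_batch=2        # observed time per 1,000 blocks
echo "$remaining $per_batch $minutes_per_batch" | \
  awk '{printf "%.0f minutes, %.2f days\n", $1/$2*$3, $1/$2*$3/60/24}'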

I have cut a couple of fields out of the log lines to make them easier to read here on GitHub; the data is unchanged. It takes about 2 minutes for mongo to process 1,000 blocks (see the snippet after the log excerpt for how to compute this from the timestamps).

eos@ip-172-22-80-165:~/mainnet$ tail -n 500 log.txt | grep block_num
15:34:58.346 mongo_db_plugin.cpp:900 block_num: 5856000
15:37:20.704 mongo_db_plugin.cpp:900 block_num: 5857000
15:39:05.830 mongo_db_plugin.cpp:900 block_num: 5858000
15:41:05.686 mongo_db_plugin.cpp:900 block_num: 5859000
15:43:17.178 mongo_db_plugin.cpp:900 block_num: 5860000
15:45:25.897 mongo_db_plugin.cpp:900 block_num: 5861000
15:47:32.501 mongo_db_plugin.cpp:900 block_num: 5862000
15:49:49.949 mongo_db_plugin.cpp:900 block_num: 5863000
15:51:45.084 mongo_db_plugin.cpp:900 block_num: 5864000
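As an aside, here is a rough sketch for turning these block_num lines into per-1,000-block timings; it assumes the truncated HH:MM:SS timestamp format shown above and no midnight rollover:

tail -n 5000 log.txt | grep 'block_num' | awk '
{
  split($1, t, ":");                        # HH:MM:SS.mmm -> seconds since midnight
  secs = t[1]*3600 + t[2]*60 + t[3];
  if (prev != "")
    printf "%s -> %s: %.1f s per 1000 blocks\n", prevblk, $NF, secs - prev;
  prev = secs; prevblk = $NF;
}'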
For reference, these are the mongo plugin settings currently in use:

mongodb-uri = mongodb://eosuser:eospassword@my.mongo.ip:27017/EOS
mongodb-filter-on  = *
mongodb-filter-out = spammer::
mongodb-queue-size = 2048
mongodb-abi-cache-size = 2048

Contrast the above logs with how quickly it was processing in the earlier blocks - several thousand blocks a minute:

04:18:09.374  mongo_db_plugin.cpp:900  block_num: 1157000
04:18:25.259  mongo_db_plugin.cpp:900  block_num: 1158000
04:18:39.471  mongo_db_plugin.cpp:900  block_num: 1159000
04:18:44.045  mongo_db_plugin.cpp:900  block_num: 1160000
04:18:49.605  mongo_db_plugin.cpp:900  block_num: 1161000
04:18:55.354  mongo_db_plugin.cpp:900  block_num: 1162000
04:19:01.120  mongo_db_plugin.cpp:900  block_num: 1163000
04:19:02.663  mongo_db_plugin.cpp:900  block_num: 1164000
04:19:08.508  mongo_db_plugin.cpp:900  block_num: 1165000
04:19:14.246  mongo_db_plugin.cpp:900  block_num: 1166000
04:19:20.138  mongo_db_plugin.cpp:900  block_num: 1167000
04:19:21.693  mongo_db_plugin.cpp:900  block_num: 1168000
04:19:29.739  mongo_db_plugin.cpp:900  block_num: 1169000
04:19:35.537  mongo_db_plugin.cpp:900  block_num: 1170000
04:19:41.240  mongo_db_plugin.cpp:900  block_num: 1171000
04:19:42.766  mongo_db_plugin.cpp:900  block_num: 1172000
04:19:48.510  mongo_db_plugin.cpp:900  block_num: 1173000
04:19:54.328  mongo_db_plugin.cpp:900  block_num: 1174000
04:19:55.894  mongo_db_plugin.cpp:900  block_num: 1175000

I suspect someone is going to suggest changing mongodb-filter-on = *, which would be a fine suggestion, except that running ~$1,000 of hardware and still not being able to bring up a single read-only API node seems not quite right. I am very open to suggestions.

This is blocking for developers: without an API to query the blockchain, our dApp is missing a significant number of features. Because of the deprecation notice for history_plugin, they refuse to use the history plugin endpoints. Please help.

brian-maloney commented 6 years ago

In my experience, 1000 IOPS is nowhere near enough to sync from nothing when writing to mongodb (I'm syncing the entire chain with no filters). I find that I need at least 3000-5000 IOPS to get caught up when writing to mongo, and I usually allocate more than that if I can. My best results have been using ephemeral local storage for the initial sync, then copying to the permanent storage location.
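For anyone sizing a volume for this, a quick way to sanity-check sustained random-write IOPS before starting the sync (assuming fio is installed; adjust the directory to wherever mongo will write):

fio --name=mongo-iops-check --directory=/var/lib/mongodb \
    --rw=randwrite --bs=4k --size=2G --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=60 --time_based \
    --group_reporting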

brian-maloney commented 6 years ago

It's worth remembering that even a modest desktop SSD can do tens of thousands of IOPS. Cloud providers really limit IOPS as a resource because it's easy for an IOPS-hog to impact other users in a shared environment, but requiring a few thousand IOPS to sync isn't that extreme of a requirement.

nelsonenzo commented 6 years ago

@brian-maloney that is absolutely terrific information, and I will look at expanding IOPS for the MongoDB initial sync and see how it goes.
I also agree that a few thousand IOPS to sync is not an extreme requirement - however, it is not published anywhere as a requirement and is definitely not a reasonable starting-point expectation for a fresh MongoDB.
Again, this is good information and thank you for sharing.

nelsonenzo commented 6 years ago

Increasing IOPS to 5,000 did alleviate some write pressure on the MongoDB instance; it's holding steady now at about 1,500 writes/second.
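If it's useful to anyone, the write rate can be watched directly with mongostat (the credentials here match the mongodb-uri above; the admin authentication database is an assumption - adjust to your setup):

mongostat --host my.mongo.ip --port 27017 \
          -u eosuser -p eospassword --authenticationDatabase admin 5
# prints insert/update/delete counts every 5 seconds; the "insert" column is
# where the mongo_db_plugin write pressure shows up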

It sped up to about 5,000 blocks/minute for a while, but now, in the 6.5M block range, it's back down to 1,000 blocks/minute. At this current rate it will finish replaying in ~33 hours, and then it still needs to continue syncing.


I looked at the timing for block_num over a wide range of time and it appears that certain blocks definitely take more time than other blocks to process. The sync process always seems to slow down around the 5-6MM block range.

Neither the CPU nor the memory appears to be particularly taxed on either instance. There are two config options whose impact I wonder about:

sync-fetch-span = 1000
reversible-blocks-db-size-mb = 340

This is my full config.ini, in case anything pops out at anyone.

agent-name = myagentname1

http-server-address = 0.0.0.0:8888
p2p-listen-endpoint = 0.0.0.0:9876
validation-mode = light
blocks-dir = "/mnt/eos/blocks"

p2p-max-nodes-per-host = 10
http-validate-host = false
https-client-validate-peers = 1
abi-serializer-max-time-ms = 3000
chain-state-db-size-mb = 32768
reversible-blocks-db-size-mb = 340
contracts-console = false
allowed-connection = any
max-clients = 20
network-version-match = 0
sync-fetch-span = 1000
connection-cleanup-period = 30
max-implicit-request = 1500
access-control-allow-origin = *
access-control-allow-headers = *
access-control-allow-credentials = false
verbose-http-errors = true

plugin = eosio::chain_plugin
plugin = eosio::chain_api_plugin
plugin = eosio::history_plugin
plugin = eosio::history_api_plugin
plugin = eosio::mongo_db_plugin 
mongodb-uri = mongodb://eosuser:eospassword@my.mongo.ip:27017/EOS
mongodb-filter-on  = *
mongodb-filter-out = spammer::
mongodb-queue-size = 2048
mongodb-abi-cache-size = 2048

p2p-peer-address = eos-mainnet-p2p.activeeos.com:9876
p2p-peer-address = peer.main.alohaeos.com:9876
p2p-peer-address = p2p.eosargentina.io:5222
p2p-peer-address = eosbp-0.atticlab.net:9876
p2p-peer-address = p2p.eos.blckchnd.com:19876
p2p-peer-address = p2p1.bp2.io:4444
p2p-peer-address = node1.bp.eosindex.io:9876
p2p-peer-address = bp.cryptolions.io:9876
p2p-peer-address = dc1.eosemerge.io:9876
p2p-peer-address = cpt.eosio.africa:9876
p2p-peer-address = 159.65.214.150:9876
p2p-peer-address = peering.mainnet.eoscanada.com:9876
p2p-peer-address = node1.eoscannon.io:59876
p2p-peer-address = mainnet.eoseco.com:10010
p2p-peer-address = p2p.bitmars.one:8080
p2p-peer-address = p2p.eosgermany.online:9876
p2p-peer-address = node1.eoshorizon.ca:9441
p2p-peer-address = p2p.prod.eosgravity.com:80
p2p-peer-address = fullnode.eoslaomao.com:443
p2p-peer-address = node2.liquideos.com:9876
p2p-peer-address = mainnet.eosnairobi.io:9876
p2p-peer-address = peer.eosn.io:9876

One of my challenges in debugging is that I do not know how to stop the nodeos process in the middle of a --hard-replay without starting all over again. If I stop it and start it again without --hard-replay, it always comes back with a dirty state. That means each iteration takes several hours for even the smallest config change. If anyone knows a better way, please do let me know.
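For what it's worth, outside of a hard replay the dirty-state flag can usually be avoided by stopping nodeos cleanly rather than killing it; a minimal sketch, assuming nodeos was started directly (not under a supervisor), and noting that it does not help resume a replay already in progress:

pkill -TERM -x nodeos                                 # SIGTERM, never SIGKILL
while pgrep -x nodeos > /dev/null; do sleep 1; done   # wait for it to flush and exit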

matthewdarwin commented 6 years ago

There is no way to restart while doing a --hard-replay... I asked the same thing a few weeks ago when trying to get my mongo running. Mine took 10+ days to replay all the blocks from scratch (on 1.2.5)

nelsonenzo commented 6 years ago

@matthewdarwin thank you! Even confirmation of what I cannot do is helpful, and it's good to know that others are experiencing the same time sink when starting a fresh node. Kudos.

SpadeRoy commented 6 years ago

Same issue here - it's very slow.

brian-maloney commented 6 years ago

Be sure you're running on non-Xen hypervisors. The version of Xen that AWS is using has a performance issue with the clock_gettime syscall (documented here: https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/).
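A quick way to check whether an instance is affected is to look at the kernel clock source (the slow path shows up when it is 'xen' rather than 'tsc'); switching is possible when 'tsc' is listed as available, though see the linked post for the caveats:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
# if tsc is available, switch to it (lasts until reboot):
echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource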

However, even with all factors in my favor I have still seen unexplained slowdown beyond 8-9 million blocks. My initial syncs (unfiltered to mongo) took 4-5 days to complete. Nodeos stopped using a full single core, but CPU, memory and IOPS resources were fine, so I was wondering if it might be disk write bandwidth. I am tempted to try this on a bare-metal instance to see if it behaves any differently, but have not gotten around to it yet.

nelsonenzo commented 6 years ago

I'll look into the clock_gettime issue, but given how lightly taxed resources currently are, I do not think that's the problem. I --hard-replayed the EOS blockchain without mongo_db_plugin a week ago and it worked fine (even without provisioned IOPS). The real blocker appears to be mongo.

Even though the blocks directory is only 50 GB, my MongoDB tapped out with 100 GB of storage. So, another spec for those reading: you will need more than 100 GB for the mongo database.

matthewdarwin commented 6 years ago

My test mongo is 622GB (with no filters applied)

nelsonenzo commented 6 years ago

@matthewdarwin another wonderful data point, thank you! I bumped the storage to 500 GB and things are working again for now; thankfully it doesn't appear that I need to start from scratch.
On the flip side, blocks are only processing at ~750 blocks/minute - in other words, 9.25 days to catch up at this pace :/ With your info I see I will need to expand storage further, and I'm glad to be aware of it up front.

brian-maloney commented 6 years ago

I went ahead and did a test on an AWS i3.metal instance - same behavior. Most of the time, once it gets slow, I don't see any threads consuming a full CPU (neither user, nor sys, nor I/O), except when mongo kicks off WTCheck.tThread, which runs at 100% briefly and then exits. Definitely odd behavior; it will take some deep digging to figure it out.

sensay-nelson commented 6 years ago

I'm at block 7,900,000 and it is processing at 500 blocks/minute as of this morning.

The culprit appears to be whatever mongo_db_plugin.cpp lines 451 and 464 are doing - each taking almost a minute:

process_accepted_block,       time per: 52119,
process_irreversible_block,   time per: 50489,

Based on these logs:

2018-09-26T16:21:51.883 mongo_db_plugin.cpp:900  block_num: 7951000
2018-09-26T16:21:54.271 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1050, size: 2062, time: 2165268
2018-09-26T16:21:55.826 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 753, size: 2063, time: 1554239
2018-09-26T16:21:58.475 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1006, size: 2063, time: 2076508
2018-09-26T16:22:00.014 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 745, size: 2062, time: 1537979
2018-09-26T16:22:02.591 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1024, size: 2062, time: 2113401
2018-09-26T16:22:04.136 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 748, size: 2063, time: 1544950
2018-09-26T16:22:06.627 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1009, size: 2062, time: 2082183
2018-09-26T16:22:08.157 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 741, size: 2062, time: 1529392
2018-09-26T16:22:10.869 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1018, size: 2063, time: 2100210
2018-09-26T16:22:12.374 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 729, size: 2063, time: 1504585
2018-09-26T16:22:15.128 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1045, size: 2063, time: 2155891
2018-09-26T16:22:16.791 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 806, size: 2062, time: 1662604
2018-09-26T16:22:19.301 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1000, size: 2062, time: 2063259
2018-09-26T16:22:20.920 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 784, size: 2063, time: 1618921
2018-09-26T16:22:22.067 mongo_db_plugin.cpp:451  process_accepted_block,       time per: 52119, size: 22, time: 1146632
2018-09-26T16:22:23.178 mongo_db_plugin.cpp:464  process_irreversible_block,   time per: 50489, size: 22, time: 1110767
2018-09-26T16:22:30.132 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 3365, size: 2066, time: 6954010
2018-09-26T16:22:31.673 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 746, size: 2065, time: 1540997
2018-09-26T16:22:40.025 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 3798, size: 2069, time: 7859804
2018-09-26T16:22:41.582 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 752, size: 2069, time: 1557789
2018-09-26T16:22:44.361 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 985, size: 2071, time: 2041792
2018-09-26T16:22:45.921 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 753, size: 2071, time: 1560101
2018-09-26T16:22:48.325 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 980, size: 2062, time: 2022624
2018-09-26T16:22:49.853 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 741, size: 2062, time: 1528623
2018-09-26T16:22:52.385 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1006, size: 2062, time: 2075169
2018-09-26T16:22:53.941 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 754, size: 2062, time: 1555799
2018-09-26T16:22:56.566 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1027, size: 2062, time: 2118030
2018-09-26T16:22:58.100 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 743, size: 2063, time: 1533801
2018-09-26T16:23:00.564 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1008, size: 2062, time: 2078990
2018-09-26T16:23:02.087 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 738, size: 2062, time: 1522299
2018-09-26T16:23:04.697 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1029, size: 2063, time: 2123730
2018-09-26T16:23:06.272 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 764, size: 2062, time: 1575387
2018-09-26T16:23:09.155 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1016, size: 2063, time: 2097564
2018-09-26T16:23:10.686 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 742, size: 2063, time: 1531091
2018-09-26T16:23:13.379 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1008, size: 2065, time: 2081655
2018-09-26T16:23:14.952 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 761, size: 2066, time: 1572620
2018-09-26T16:23:17.522 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1020, size: 2064, time: 2105318
2018-09-26T16:23:19.078 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 753, size: 2064, time: 1556079
2018-09-26T16:23:21.653 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1015, size: 2069, time: 2100241
2018-09-26T16:23:23.236 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 765, size: 2068, time: 1583635
2018-09-26T16:23:25.735 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 972, size: 2062, time: 2005002
2018-09-26T16:23:27.278 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 747, size: 2063, time: 1543056
2018-09-26T16:23:29.791 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1015, size: 2069, time: 2101607
2018-09-26T16:23:31.345 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 750, size: 2069, time: 1553670
2018-09-26T16:23:34.777 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1379, size: 2065, time: 2847908
2018-09-26T16:23:39.576 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 2325, size: 2064, time: 4799507
2018-09-26T16:23:42.167 mongo_db_plugin.cpp:426  process_applied_transaction,  time per: 1008, size: 2072, time: 2090521
2018-09-26T16:23:51.464 mongo_db_plugin.cpp:438  process_accepted_transaction, time per: 4486, size: 2072, time: 9296791
2018-09-26T16:23:51.467 mongo_db_plugin.cpp:900  block_num: 7952000

If I look at the mongodb logs themselves, nothing appears to be taking more than 2-3 seconds, and that's only a minority of operations; most are in the 200-300 ms range, as one would expect from mongo. I feel like there is something the plugin is doing that takes an increasing amount of time the more blocks are processed.
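To make eyeballing this easier, here is a rough sketch for averaging the "time per" values per function from the nodeos log, assuming the field layout in the excerpt above:

grep 'time per' log.txt | awk '
{
  gsub(",", "", $3); gsub(",", "", $6);       # strip trailing commas
  sum[$3] += $6; n[$3]++;                     # $3 = function name, $6 = "time per" value
}
END {
  for (f in sum)
    printf "%-30s avg time per: %.0f  (samples: %d)\n", f, sum[f]/n[f], n[f];
}'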

brian-maloney commented 6 years ago

I am pretty sure it has more to do with the number of transactions in the block than with the number of blocks that have been processed.

sensay-nelson commented 6 years ago

@brian-maloney well, that would make more sense XD I think I'm giving up on brute force with a * filter. A distributed set of nodeos+mongo pairs for different filter sets seems like the only reasonable approach at this time. Devs then need to know which db to query for which type of object, but I think that would be OK.
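A rough sketch of that split - two nodeos instances feeding two separate mongo databases, each with a narrower filter set (the contract names here are placeholders, not a recommendation):

# node A config.ini: token transfers only
mongodb-filter-on  = :transfer:

# node B config.ini: everything received by our own contracts (placeholder accounts)
mongodb-filter-on  = mycontract11::
mongodb-filter-on  = mytokenaccnt::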

taokayan commented 6 years ago

I guess you may want to filter out blocktwitter as well. An SSD with high IOPS is required to sync up with mongo-db.

helperShang commented 6 years ago

It took me a month and a half.

nelsonenzo commented 6 years ago

Update: doing a bit better - almost in sync after about 4 days, and it's moving along at a rate that looks like it will catch up in another 1-2 days.

Using two i3.xlarge instances in AWS, which have 4 vCPUs and 900 GB of SSD storage, has helped significantly. One instance is running nodeos, the other mongodb. We're no longer limited by IOPS, and there is enough disk storage for the blocks and mongodb.

The other crucial improvement came from a better mongodb filter set. I knew a filter set was advised and that catching up with a * filter was wishful thinking, but I never found good documentation on filter sets. With some advice from a colleague, I settled on the following mongo configuration (see the note after the snippet for how I read the filter syntax). I believe the mongodb-filter-out entries were the real game changers, and I might suggest adding them to the documentation.

plugin = eosio::mongo_db_plugin
mongodb-uri = mongodb://eosuser:eospassword@mongo.internal.company.com:27017/EOS
mongodb-queue-size = 2048
mongodb-abi-cache-size = 2048
mongodb-filter-on  = :transfer:
mongodb-filter-out = eosio:onblock:
mongodb-filter-out = gu2tembqgage::
mongodb-filter-out = blocktwitter::
mongodb-filter-out = spammer::
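For anyone else puzzling over the syntax: my reading of these entries is that each filter is a receiver:action:actor triple, with a blank field matching anything - I have not found this clearly documented either, so treat it as an interpretation rather than gospel. Annotated:

mongodb-filter-on  = :transfer:        # keep 'transfer' actions on any contract, by any actor
mongodb-filter-out = eosio:onblock:    # drop the once-per-block eosio 'onblock' action
mongodb-filter-out = blocktwitter::    # drop everything received by the blocktwitter account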

With that in place, it has taken about 4 days of processing so far, with 1-2 more left to go, and it appears to be processing at a clip that will catch up. I'll update this ticket again once it's there. Thank you again to everyone who has contributed knowledge up to this point.

austinHodge commented 6 years ago

@nelsonenzo thanks for all the info in this thread. Have you been able to stay caught up with this configuration? Also, I'm curious about the i3 instances. From my understanding, these use ephemeral NVMe SSDs. Do you expect to have to re-sync entirely if your instance stops? If not, what is your backup / restore strategy considering that your db will be out of sync with your EOS node?

nelsonenzo commented 6 years ago

@austinHodge sorry for the slow response. It did catch up and has run smoothly since. These are indeed ephemeral drives. For redundancy, I'm going to run 2 sets of nodeos+mongo on these i3's.

One working theory I have is to turn one set off - nodeos off, then mongo off - copy the data to new instances, and, in theory, restart them both. I tried this yesterday, but admittedly made a mistake and accidentally restarted nodeos with --hard-replay. Apparently the backup of the blocks directory it makes when you use --hard-replay is not useful as an actual backup - the peers just tell me to go away, wrong chain now. If done correctly, I can usually stop nodeos and restart it gracefully.
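For the record, the copy step I have in mind looks roughly like this - a sketch only, with assumed paths (the destination hostnames and the mongo dbpath are placeholders; adjust to your layout):

# stop both processes cleanly: nodeos first, then mongod
pkill -TERM -x nodeos && while pgrep -x nodeos > /dev/null; do sleep 1; done
sudo systemctl stop mongod            # or however mongod was started

# copy chain data and mongo data to the new hosts (placeholder hostnames)
rsync -a /mnt/eos/blocks/   new-nodeos-host:/mnt/eos/blocks/
rsync -a ~/mainnet/state/   new-nodeos-host:~/mainnet/state/     # state dir path assumed
rsync -a /var/lib/mongodb/  new-mongo-host:/var/lib/mongodb/     # default dbpath; yours may differ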

It's worth noting that, once all caught up, the disk resource usage is quite normal. If I can get the copy-and-restart process down, I can likely move off of these i3's and use EBS-backed volumes. That would make it much easier to stop the processes and just take snapshots of the mounted storage - they have to be snapshotted in pairs, and definitely with the processes stopped. It sadly takes one week of these servers running to get one shot at getting it right ::smh::

I just created a Telegram channel, https://t.me/nodeOSadmins, if anyone who comes across this thread would like to join and ask questions / share experience running nodeos from a systems perspective.

austinHodge commented 6 years ago

@nelsonenzo thanks again for the info. I just joined the telegram channel, and I'll let you know if I discover anything interesting. Btw, I'm documenting all of the steps I'm going through and will publish them once things are running well.

MarcelBlockchain commented 6 years ago

v1.4 introduced the snapshot feature - you can already see the snapshots folder next to the 'state' folder. It allows easy catch-up of the whole blockchain, all verified. The BPs are working on that right now; I can imagine CryptoLions and EOS Canada could be the first to offer snapshots for download. See the eosio v1.4 announcement. If anyone finds a BP offering that feature, let us know here and on this Stack Exchange thread. Thanks.
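If I read the v1.4 notes correctly, creating and restoring a snapshot looks roughly like this (a sketch, assuming producer_api_plugin is enabled on the source node and the usual HTTP port; the file path and data dir shown are illustrative):

# on an already-synced node with: plugin = eosio::producer_api_plugin
curl -X POST http://127.0.0.1:8888/v1/producer/create_snapshot
# the response includes the path of the snapshot written under the data dir's snapshots/ folder

# on a fresh node, start from the snapshot instead of replaying:
nodeos --snapshot /path/to/snapshots/snapshot-<head-block-id>.bin --data-dir ~/mainnet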

I'm curious whether it still requires us to sync the mongodb for days. For replaying I use https://eosnode.tools/blocks , which downloads the whole blockchain in ~40 minutes if you have a server close to Dublin (sourced from S3 & Google), plus days for the mongodb catch-up.

I have some improvements for the mongodb-store-* options: they default to on, so set the ones you don't need to 0. When I had everything on and queried the DB for a transaction using CryptoLions/EOS-mongo-history-API, it was returned 3 times. More info from Heifner here.

I'd be happy to end up with a minimalistic config.ini for the community and dApp developers. Let me know if you find any more improvements.

agent-name = eosbackend

http-server-address = 0.0.0.0:8888
p2p-listen-endpoint = 0.0.0.0:9876

bnet-endpoint = 0.0.0.0:4321
blocks-dir = "/mnt/volume_lon1_02/mainnet/blocks"

bnet-follow-irreversible = 0
bnet-no-trx = false
read-mode = read-only
validation-mode = light
mongodb-uri = mongodb://127.0.0.1:27017/EOS
mongodb-queue-size = 2048
mongodb-abi-cache-size = 2048
mongodb-block-start = 1
mongodb-store-blocks = 0
mongodb-store-transactions = 0
mongodb-store-transaction-traces = 0
mongodb-store-action-traces = 1
mongodb-filter-on  = :transfer:
mongodb-filter-out = eosio:onblock:
mongodb-filter-out = gu2tembqgage::
mongodb-filter-out = blocktwitter::
mongodb-filter-out = spammer::

wasm-runtime = wabt
p2p-max-nodes-per-host = 1
http-validate-host = false
https-client-validate-peers = 1
abi-serializer-max-time-ms = 10000
chain-state-db-size-mb = 32000
reversible-blocks-db-size-mb = 340
contracts-console = false
allowed-connection = any
max-clients = 100
network-version-match = 0
sync-fetch-span = 500
connection-cleanup-period = 30
max-implicit-request = 1500

access-control-allow-origin = *
access-control-allow-headers = *
access-control-allow-credentials = false
verbose-http-errors = true

plugin = eosio::chain_plugin
plugin = eosio::chain_api_plugin
plugin = eosio::bnet_plugin
plugin = eosio::mongo_db_plugin
plugin = eosio::http_plugin

MarcelBlockchain commented 6 years ago

@nelsonenzo thanks for the detailed work and ongoing feedback! Do you want to change https://t.me/nodeOSadmins from being a 'channel' to a 'group', so other users can message as well?

jgiszczak commented 5 years ago

This thread seems to have run its course; GitHub will preserve it for future reference. Discussions of a similar nature should in future be conducted on Stack Exchange. To keep things organized, GitHub issues are intended for bug reports and feature tracking, and Stack Exchange is the right place for any technical support discussions. Thanks!

keithyau commented 5 years ago

Hi, newbie here.

Will the config above filter out a lot of the data that would otherwise be written to Mongo - for example, the transactions collection?