IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0

200% CPU usage and constantly lagging behind tip by ~80 blocks #754

Closed CherryDT closed 8 months ago

CherryDT commented 3 years ago

I am using the cardano-graphql Docker setup. The issue is that since I upgraded it to 5.0.0, the sync is no longer working properly.

The initial sync took ages, and now it reports as fully synced but in reality is always lagging ~80 blocks behind the tip, while the cardano-db-sync process is using 200% CPU (2 out of 4 cores) and ~9 GB RAM. Note that the remaining two cores are mostly idle.

The system is on a server well beyond the minimum spec listed here. It's an r5a.xlarge AWS instance with Ubuntu 20, 4 CPU cores, 32 GB RAM and a 160 GB hard disk.

The logs for cardano-db-sync-extended seem to show normal operation (but if you compare the block numbers to a block explorer you'll see it's lagging behind):

cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:22:45.20 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609429, block 6122619, hash 215ed4b1f0c6f21de81dabbbde04f1cee5cd22e79020e05d9e54426b6fab7013
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:23:20.97 UTC] insertShelleyBlock(Mary): continuing epoch 284 (slot 284680/432000)
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:23:20.97 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609480, block 6122620, hash 132d4ee2a795f06d327ac2a4ae69fb513189f2d5820b55873bd2982f6a49d5bc
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:23:20.97 UTC] Offline pool metadata fetch: 0 results, 28 fetch errors
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:24:03.76 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609498, block 6122621, hash 1e595380f8f550373cc3d0de2d20608adc431da68bcae284c80584891c38024e
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:24:39.46 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609514, block 6122622, hash 1cacfd1015b217c975fb043874e4fce4627eb42bf389535be30a33ff3e17e2c5
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:25:23.05 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609558, block 6122623, hash ec329c076cd2b7242f31fae7bcc5c67370ce0f3cd9091c29c3809a35794fa3f3
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:26:06.12 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609581, block 6122624, hash 07fd6200ceab95424f6750d8921f9b3199a51ceb25e324893e75e2606380fea3
cardano-db-sync-extended_1  | [db-sync-node:Info:16454] [2021-08-17 05:26:45.19 UTC] insertShelleyBlock(Mary): epoch 284, slot 37609620, block 6122625, hash c9d2b6bfd5dedd12273a76f46378507ef571093c638fc3658c23988834abf8b1

What can I do to diagnose and fix this?

erikd commented 3 years ago

Is that an electro-mechanical (spinning) disk or an SSD? Spinning disks are significantly slower than SSDs.

CherryDT commented 3 years ago

SSD (via EBS)

But before the upgrade, it was working fine on 4.0.0...

erikd commented 3 years ago

Do you have IO throughput numbers for your storage? And is the storage directly connected to the machine, or is it on the LAN?

CherryDT commented 3 years ago

If I understand the AWS EBS docs correctly, this volume would give me 128 MiB/s. I believe it's somehow networked but I never had an issue with that.

Still, that's indeed a bit slow. But why would this have become an issue only after the upgrade?

Also, I tried something else now: upgrading to an instance with 8 cores instead of 4. To my surprise that solved the problem, but it's not really logical to me why that would be the case. (plus, it's very expensive and I'd prefer not to keep it this way...)

Disk speed would have sounded much more logical but the number of CPU cores shouldn't affect the issue that much then, right? I'm confused.
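
(For reference, a quick way to get real IOPS/throughput numbers rather than the EBS spec-sheet figure is a short fio run; the package install step and the target directory below are assumptions - point it at whatever volume actually holds the PostgreSQL data directory.)

# Mixed random read/write benchmark, roughly database-shaped (8k blocks).
sudo apt-get install -y fio
fio --name=randrw --directory=/var/lib/postgresql \
    --rw=randrw --rwmixread=70 --bs=8k --size=2G \
    --ioengine=libaio --iodepth=32 --direct=1 \
    --runtime=60 --time_based --group_reporting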

erikd commented 3 years ago

4.0.0 stored much, much less data in the database than 10.0.x. 4 CPU cores are sufficient.

There is simply no reason why 8 cores would be faster. I suspect that it's simply a different machine with faster IO.

CherryDT commented 3 years ago

Hm. Thank you, I'll upgrade to a larger volume then (which comes with higher iops as well) and see.

javiertc commented 3 years ago

Hi @CherryDT, are you using gp2 or gp3? Maybe you can try gp3 with higher IOPS and you don't have to increase the volume size.

erikd commented 3 years ago

@CherryDT If this is sorted out now, please close this ticket.

CherryDT commented 3 years ago

It seems the 8 core thing worked only temporarily/randomly, the issue came back.

Anyway, I have switched over to a gp3 volume with 6000 IOPS and 500 MB/s throughput. The problem still exists, although it's "a bit" better - it's now lagging only 10-15 blocks behind, but it still doesn't fully catch up even after hours. CPU usage is now 90-100% for cardano-db-sync, plus 40-50% each for two postgres processes (there are more postgres processes, but they use less CPU).

Those two processes have the names postgres: postgres cexplorer 172.18.0.5(38560) INSERT and postgres: postgres cexplorer 172.18.0.5(39832) UPDATE.
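
(For reference - a hedged sketch for seeing what those busy backends are actually executing, using the standard pg_stat_activity view; the container name is a placeholder and the cexplorer database name follows the docker-compose setup described above.)

docker exec -it <postgres-container> psql -U postgres -d cexplorer -c \
  "SELECT pid, state, wait_event_type, wait_event,
          now() - xact_start AS xact_age, query
     FROM pg_stat_activity
    WHERE datname = 'cexplorer' AND state <> 'idle'
 ORDER BY xact_age DESC NULLS LAST;"

If the INSERT backend spends most of its time in IO waits such as DataFileWrite or WALWrite, that points back at storage; if it is always active with no wait event, the bottleneck is more likely the db-sync/PostgreSQL work itself.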

erikd commented 3 years ago

It seems the 8 core thing worked only temporarily/randomly, the issue came back.

8 cores is more than double what is actually required. When you first moved machines I suspect you got better IO, and that is what improved things; then something changed and dropped you back to the old performance.

This storage is directly connected to the machine and not via a LAN, right?

CherryDT commented 3 years ago

I just checked, it shows up directly as an NVMe device.

erikd commented 3 years ago

@rhyslbw ?

CyberCyclone commented 3 years ago

@CherryDT Based on my testing for running ADATools.io, I've found AWS largely unusable / extremely expensive. Sometimes queries time out or never complete. I couldn't see anywhere which instance type you're using, but keep in mind that AWS loves to use crediting systems, i.e. you can have "burstable" instances, but they drop off dramatically once you've used up the credits. While it might say that you're using high CPU resources, that can be deceptive, as it can also mean work is stuck in the CPU queue waiting for something else (network or disk).

Based on my tests, the first thing you need to prioritise is your disk IOPS, and 6000 is actually really small for DB usage of this nature. As a reference, DigitalOcean and Vultr can consistently hit above 100,000 SSD IOPS. I've found Vultr's bare-metal servers to give the best value for money, and since you have the whole box you get consistent resources. Here's an article that shows the IOPS difference between AWS and DO: https://dzone.com/articles/iops-benchmarking-disk-io-aws-vs-digitalocean

The second thing to keep an eye on is your CPU frequency. Since a lot of the workload is single-threaded, a higher-frequency CPU can give better outcomes than more cores. That's another reason I prefer DigitalOcean and Vultr; they both offer high-frequency CPU options.

CherryDT commented 3 years ago

Thank you, I'll try DO or Vultr... Maybe the iops and/or speed should be part of the minimum system requirements in the docs then...?

CyberCyclone commented 3 years ago

Thank you, I'll try DO or Vultr... Maybe the iops and/or speed should be part of the minimum system requirements in the docs then...?

Unfortunately there's no direct formula for deciding on a minimum IOPS, as it all depends on how complex your queries are. E.g. a simple transaction lookup is a lot quicker than summing all transaction outputs, which means one person might only need 5,000 IOPS while someone else needs 30,000 IOPS.

IOPS is just another aspect of system design; the diversity of Cardano just makes it more obvious.

CherryDT commented 3 years ago

Hm, but here the issue is that the DB doesn't even get fully synced because of it, so the baseline of "no queries at all yet" should probably have some number attached to it... (which I am apparently not reaching)

CyberCyclone commented 3 years ago

I wouldn't know what a minimum IOPS would be as I've always gone for systems that are above 65,000 IOPS. It could run on less, but I wouldn't know.

cardanians commented 3 years ago

I now have the same issue.

The node is synced; only db-sync is currently ~40 minutes behind (not stuck, still inserting blocks, but slowly).

v10, bare metal, NVMe, 48 cores, 50% CPU usage (total, so there is no issue with CPU overload).

Isn't this related to reward calculations, since we are in the middle of an epoch?

erikd commented 3 years ago

The node is synced; only db-sync is currently ~40 minutes behind (not stuck, still inserting blocks, but slowly).

So db-sync is 40 minutes behind the node? For how long, and how far behind?

cardanians commented 3 years ago

@erikd At the time of writing it was about 40 minutes behind the node (on average around 30 minutes), and in total the lag lasted maybe 4 hours. At some points it was 60 minutes behind the node, later 20 minutes behind, averaging maybe 30 minutes. Now it is okay.

cardanians commented 3 years ago

@erikd By the way, is db-sync v10 compatible with cardano-node 1.29.0?

erikd commented 3 years ago

@cardanians Sorry, not sure.

stefanofranz commented 3 years ago

We are also encountering this issue - dbsync seems to be struggling to insert blocks at the same speed as the node is receiving them.

Are there any debugging logs we can turn on to discover the source of latency?

PG is co-located on the same physical machine as the node and dbsync. CPU usage is < 30%, RAM usage < 50%, and the disk IO queue is very small, with no spikes...

erikd commented 3 years ago

@stefanofranz Machine specs?

stefanofranz commented 3 years ago

@stefanofranz Machine specs?

Ryzen 5900, 64gb ram and 1tb 970 EVO plus SSD x2 in raid 1

pg.conf

# DB Version: 13
# OS Type: linux
# DB Type: dw
# Total Memory (RAM): 24 GB
# CPUs num: 4
# Connections num: 20
# Data Storage: ssd

max_connections = 20
shared_buffers = 6GB
effective_cache_size = 18GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 157286kB
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2
synchronous_commit = off
listen_addresses = '*'

erikd commented 3 years ago

RAID 1 is mirroring. I would expect that to be slower. Would you be able to try this without mirroring?

stefanofranz commented 3 years ago

Yes - I can turn it off, but it's very unlikely to be the problem. I've benchmarked PG and it's recording between 50-100K IOPS with latencies under 1 ms on average. The disk queue is effectively empty all the time.

It just seems that dbsync itself is running slowly. Is there a way I can print out the time it's taking to do PG ops? The next step will be to install monitoring in PG itself.
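
(For reference - one way to get per-statement timings on the PostgreSQL side, sketched under the assumption that you can run psql as a superuser and restart the server; log_min_duration_statement and pg_stat_statements are standard PostgreSQL features, but the 500 ms threshold and the service name are placeholders.)

# Log every statement slower than 500 ms and enable pg_stat_statements.
psql -U postgres -d cexplorer <<'SQL'
ALTER SYSTEM SET log_min_duration_statement = '500ms';
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
SQL
sudo systemctl restart postgresql   # restart needed for shared_preload_libraries

# After letting db-sync run for a while, see where the time goes:
psql -U postgres -d cexplorer -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
psql -U postgres -d cexplorer -c \
  "SELECT calls, round(total_exec_time) AS total_ms,
          round(mean_exec_time::numeric, 1) AS mean_ms, left(query, 60) AS query
     FROM pg_stat_statements
 ORDER BY total_exec_time DESC
    LIMIT 10;"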

CyberCyclone commented 3 years ago

@stefanofranz Machine specs?

Ryzen 5900, 64gb ram and 1tb 970 EVO plus SSD x2 in raid 1

Is the pg.conf from this git setup, or is this your own Postgres config? I haven't changed or seen the db-sync configs at all, so I'm not sure what the defaults are, but max_connections = 20 seems small to me for any database.

stefanofranz commented 3 years ago

@stefanofranz Machine specs?

Ryzen 5900, 64gb ram and 1tb 970 EVO plus SSD x2 in raid 1

Is the pg.conf from this git setup, or is this your own Postgres config? I haven't changed or seen the db-sync configs at all, so I'm not sure what the defaults are, but max_connections = 20 seems small to me for any database.

It came from https://pgtune.leopard.in.ua/#/, which has always worked very well - I set connections to 20 as it is ONLY used for dbsync.

erikd commented 3 years ago

I have not seen this. Until someone figures out how to reproduce it, I cannot debug it.

CyberCyclone commented 3 years ago

Has it been established that it is db-sync that is falling behind cardano-node? I'm wondering whether cardano-node might be the one that's falling behind.

CherryDT commented 3 years ago

At least in my case I believed it to be db-sync, because the log contained messages indicating that a new block X was the new tip alongside messages indicating that block X-80 was being processed, and there were also PG processes using CPU.

stefanofranz commented 3 years ago

Yes, just like @CherryDT - my node is on, say, block X, while db-sync says it's at insertShelleyBlock X-500 or so, with the number being different at different times.
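
(For reference - a hedged sketch for quantifying the lag by comparing the node's tip with the highest block db-sync has written; cardano-cli on the same host, the socket path, and the cexplorer connection details are assumptions, and the block table / block_no column follow the db-sync schema.)

export CARDANO_NODE_SOCKET_PATH=/path/to/node.socket
cardano-cli query tip --mainnet                 # block number at the node's tip

psql -U postgres -d cexplorer -At -c \
  "SELECT max(block_no) FROM block;"            # highest block inserted by db-sync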

CherryDT commented 3 years ago

By the way, my client is working on getting a Vultr account... I'll let you know once it's all set up whether it's working well...

CyberCyclone commented 3 years ago

@stefanofranz Machine specs?

Ryzen 5900, 64gb ram and 1tb 970 EVO plus SSD x2 in raid 1

Is the pg.conf from this git setup, or is this your own Postgres config? I haven't changed or seen the db-sync configs at all, so I'm not sure what the defaults are, but max_connections = 20 seems small to me for any database.

It came from https://pgtune.leopard.in.ua/#/, which has always worked very well - I set connections to 20 as it is ONLY used for dbsync.

@stefanofranz When you say it's only used for db-sync, do you mean that no applications connect to Postgres?

I can understand db-sync falling behind if it's fighting for a connection to Postgres while applications are hogging the connections. To me this is the most likely scenario, especially if it changes throughout the day based on peak application load from users.

If it's just sitting there and all it's doing is inserting a block every 20 seconds and it's somehow falling behind, that's a weird one.

CherryDT commented 3 years ago

On Vultr it works OK now, it seems.

I still think there should be a big warning about this issue where the minimum specs are described (ideally also in the main cardano-graphql readme), stating that the latest version needs huge I/O resources and that a regular configuration on AWS won't suffice even if CPU/RAM resources are well beyond the minimum, and recommending a good configuration like Vultr bare metal as you did here. I imagine I'm not the only one who has their whole infrastructure set up on AWS (given it's a pretty popular service used by many large companies too) and had it working well there until this update...

(To be honest I'm still surprised this is even such an issue, because I also have a Bitcoin node with ElectrumX on top which has many times larger a blockchain to cope with, and it runs fine on a cheaper configuration than this - with a larger hard disk, of course, but still... We also have a few other coins with similar setups, like Dogecoin plus Blockbook, and I never encountered any I/O bottlenecks; until now all volumes were gp2 with default settings, and I never had to tweak anything there. I'm curious, then: is this sort of - expensive - resource demand really needed and justified? What will happen when Cardano gains more traction and grows many times larger? Will it still be feasible to even run such a service...?)

erikd commented 3 years ago

To be honest I'm still surprised this is even such an issue, because I also have a Bitcoin node with ElectrumX on top which has many times larger a blockchain

The bitcoin blockchain may be bigger in terms of bytes, but is probably not bigger in terms of number of blocks. BTC has a block every approx. 10 minutes, while Cardano has one every 20 seconds.

I suspect the system requirements of ElectrumX are significantly less than db-sync and PostgreSQL.

Regardless of all that, until we have a way of reproducing this issue, I cannot debug it.

CherryDT commented 3 years ago

I can give you access to the troubled install on the AWS server if that helps... You can find me in the member list of the Cardano Discord with the same username.

EDIT: Emailed you.

CyberCyclone commented 3 years ago

To be honest I'm still surprised this is even such an issue, because I also have a Bitcoin node with ElectrumX on top which has many times larger a blockchain

If IOPS is the issue, as it appears to be, then I'm not surprised; this is more about understanding how IOPS impacts database performance in general, and also how AWS and EBS work.

For a general overview for those who aren't aware: gp2 on AWS increases IOPS as you increase storage, so you either grow the volume or provision enough IOPS yourself to get to the same level as a directly attached NVMe drive.

Depending on the storage type, AWS also uses quota systems, which might explain why it sometimes drops out. E.g. some EBS volumes give you credits to "burst" above the baseline for as long as you have credits. If Cardano has a large number of transactions to process, it could start eating into the credits, and if they run out, it would then start to lag.

Here's how AWS provisions their IOPS: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
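
(For reference - a hedged way to check what a volume is actually provisioned at and whether a gp2 volume has run out of burst credits, using the standard AWS CLI; the volume ID below is a placeholder, and BurstBalance only exists for gp2/st1/sc1 volumes.)

aws ec2 describe-volumes --volume-ids vol-0123456789abcdef0 \
  --query 'Volumes[0].{Type:VolumeType,Size:Size,Iops:Iops,Throughput:Throughput}'

aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS --metric-name BurstBalance \
  --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
  --start-time "$(date -u -d '3 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time   "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 --statistics Average

A BurstBalance that trends toward zero while db-sync falls behind would match the credit-exhaustion scenario described above.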

Cmdv commented 8 months ago

It looks like there hasn't been any activity on this issue for some time. We're closing it for now. If you're still experiencing this problem or have any updates, please reopen it or create a new issue. Your feedback is valuable, thank you!