gridcoin-community / Gridcoin-Research

Gridcoin-Research
MIT License
588 stars 173 forks source link

High memory consumption of headless client in mapBlockIndex #153

Closed Techbert08 closed 7 years ago

Techbert08 commented 7 years ago

I Dockerized the headless gridcoinresearchd daemon and BOINC with the intention of running them alongside an application I have on GCE as a way to use some excess capacity.

Unfortunately gridcoinresearchd absorbs all my memory and dies. I believe it fails to malloc while reading the leveldb blockindex into the global mapBlockIndex map at startup. I only have 256 MB available on this instance and no swap.

Is it reasonable to expect the client to operate in such a constrained memory environment, and is there value in making that possible? Given that leveldb already has this data, are there actually performance gains from dumping its contents into a global var rather than just reading from the database as needed?

grctest commented 7 years ago

Related: https://github.com/gridcoin/Gridcoin-Research/issues/140

I think 256MB is too low, mine works with 512MB but if you want to self-compile you'll need to setup swap space or temporarily up the resources to 1GB.

Techbert08 commented 7 years ago

To be clear, I'm not expecting to compile in that environment--just run. On #140 if there are high loads at startup that would explain why it doesn't start for me.

If the problem is likely to be a small number of big globals I'm happy to take a crack at removing them and send a pull request. If it's more likely to rabbit-hole, then I'll find another way to run the client.

Let me know or feel free to close if it sounds infeasible.

grctest commented 7 years ago

What would the impact be with removing these big globals?

gridcoin commented 7 years ago

Grctest, interesting logo, what is in it?

"These big globals": If you are referring to the mapBlockIndex, thats the vector the coin uses to memorize the block headers- and they are indexed so they can be accessed in order. The coin absolutely wont function without that. It cant keep track of total coins per address without it. This is the essence of running a full node. If you wanted to get around it you would need to run something like electrum or a web service that ultimately accesses a full node.

grctest commented 7 years ago

@gridcoin

Grctest, interesting logo, what is in it? It's the logo for Project Rain (extending DPOR to 36+ different cryptocurrency platforms).

Relevant links: https://github.com/grctest/project-rain-site https://steemit.com/gridcoin/@cm-steem/sneak-preview-of-project-rain-screenshots https://steemit.com/beyondbitcoin/@cm-steem/project-rain-update-regarding-development

Techbert08 commented 7 years ago

I poked it a bit, and I agree.

I tried getting rid of the "pprev" and "pnext" pointers and instead stored the hashes of those blocks. At places where those are accessed I did a GET request of the database for the key. That gets the working set size down because the keys are loaded only when needed and freed when not. However, it makes scanning along the chain a lot of little reads and means the many callers have to be updated,

Converting the bare pointers to boost::shared_ptr helps, but it's still a large change. I'll go ahead and close this since it's probably not worth doing.

denravonska commented 7 years ago

An idea could be to replace large vectors with deque which does not have the contiguous memory requirement. That could make it easier to run under low memory conditions where a large memory block might not be available and swap has to be used.

Techbert08 commented 7 years ago

That would probably help, too. The data structure I mentioned heap allocates a bunch of objects in a map container and also wires them together into a linked list. I ran into difficulty because there is a lot of global state and several modules from an early Bitcoin client that may not actually be used in Gridcoin. I think some basic refactoring would make this change much easier if further divergence from Bitcoin's code structure is OK.

Would code cleanup advance the actual long-term needs of the project, or are there other more important places that need contributions?

denravonska commented 7 years ago

I have two proposal fixes which would lessen the RAM consumption of the map.

1) There are 3 unused variables in CBlockIndex; nLastPORBlock, sLastPORBlockHash and nTotalPORPayments. These can currently safely be removed to reduce the size of the CBlockIndex struct from 408 to 360 bytes. The overall memory consumption went down for me from peak 453MiB to 417MiB.

2) CBlockIndex::nIsSuperBlock and CBlockIndex::nIsContract can be merged into the flags variable nFlags. This change only reduces the struct size by 8 bytes (on my system) and the fix is more complicated since the current flags are already serialized to disk. This only drops the RAM usage by 7MiB

With both changes there is a RAM reduction of around 9.5% on systems where intis 32bit. If you want to I can push my changes to a branch and reference this issue so you can have a look.

Techbert08 commented 7 years ago

The first sounds well worth doing. The second seems like a good bit of trouble for a small gain. Maybe it could be rolled into a future update that changes the storage format?

denravonska commented 7 years ago

Weird, the fix is incorrect on github but fine locally but incorrect when pushed to github. Compare https://github.com/denravonska/Gridcoin-Research/commit/50a405f579c9e00a62f60d2f162ebfeb213aa3f5 to https://gist.github.com/denravonska/e0c76cf3582976692499036cbc138076.