holiman opened this issue 3 years ago
I added some printouts in `session_compaction.go`:

```
leveldb: newCompaction. t0=1, t1=0 elems
leveldb: newCompaction: after expand, t0=2923417, t1=0 elems
leveldb: adding 2923417 iterators at level 0
```
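For context on why `t0` can jump from 1 to ~2.9M: level-0 tables may overlap each other in key range, so a compaction seeded with one table has to keep pulling in every overlapping table until it reaches a fixed point. A minimal, self-contained sketch of that expansion (hypothetical toy code, not goleveldb's actual implementation):

```go
package main

import "fmt"

// table is a simplified SSTable descriptor (hypothetical; the real
// struct carries file numbers, sizes, etc.).
type table struct{ min, max byte }

// overlaps reports whether two key ranges intersect.
func overlaps(a, b table) bool { return a.min <= b.max && b.min <= a.max }

func contains(sel []table, t table) bool {
	for _, s := range sel {
		if s == t {
			return true
		}
	}
	return false
}

// expand grows the selection until no unselected level-0 table overlaps
// the selection's combined key range -- the fixed point the
// "after expand" log line above refers to.
func expand(seed table, level0 []table) []table {
	sel := []table{seed}
	lo, hi := seed.min, seed.max
	for changed := true; changed; {
		changed = false
		for _, t := range level0 {
			if contains(sel, t) {
				continue
			}
			if overlaps(t, table{lo, hi}) {
				sel = append(sel, t)
				if t.min < lo {
					lo = t.min
				}
				if t.max > hi {
					hi = t.max
				}
				changed = true
			}
		}
	}
	return sel
}

func main() {
	// Chained overlaps: picking any one table drags in all the others,
	// which is how a single seed table can expand to all of level 0.
	level0 := []table{{0, 10}, {9, 20}, {19, 30}, {29, 40}}
	fmt.Println(len(expand(level0[0], level0))) // all 4 tables selected
}
```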
So, for some reason the db has wound up in this state:
```
 Level |   Tables   |    Size(MB)   |   Time(sec)   |    Read(MB)   |   Write(MB)
-------+------------+---------------+---------------+---------------+---------------
   0   |    2923421 | 5479128.78218 |       0.00000 |       0.00000 |       0.00000
-------+------------+---------------+---------------+---------------+---------------
 Total |    2923421 | 5479128.78218 |       0.00000 |       0.00000 |       0.00000
```
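As a quick sanity check, the stats table is self-consistent: 5,479,128 MB spread over 2,923,421 tables works out to roughly 1.87 MB per table, in line with the ~2MB default table size mentioned below.

```go
package main

import "fmt"

func main() {
	// Figures copied from the stats table above.
	const totalMB = 5479128.78218
	const tables = 2923421
	fmt.Printf("average table size: %.2f MB\n", totalMB/tables) // ~1.87 MB
}
```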
We're investigating what may have caused this to happen.
When running go-ethereum as an archive node, the total amount of data stored in leveldb is extremely large. In this case, it contains roughly 5.4TB of data, in ~3M ldb-files at the default size of 2MB each. (Edit: 5.4TB data is the correct number)

When only just opening the database, doing some reading, and closing it, the close procedure takes upwards of 18 minutes to complete. However, the big problem is rather the memory consumption while this is happening.
And this appears to be due to the iterators used during table compaction.
Example stack trace:
What is happening here is that the `merged_iterator` calls `First` on each of its sub-iterators. I have not checked explicitly, but I assume that the sub-iterators (`indexed_iterator`) are one per file, so roughly 3M of them. For each one, `First()` loads the block header. This consumes ~20G memory, which would suggest that each block header is ~6K (does that sound reasonable?).

Now, from a higher-level perspective, I can understand that the merged iterator does need to load the (first) keys of the sub-iterators, but it seems a bit problematic to load the full block header, via `indexedIterator.setData`, from a memory consumption perspective.

I've tried some ways to see if I could hack around it, but neither seemed very feasible:

- Deferring the data-loading in `indexed_iterator`, since `Next` later on wants to loop over all keys anyway. This doesn't quite fly, since `First` then calls `Next`, and `Next` needs to actually load the data to figure out the first key. So the total effect is the same, unless a more thorough rewrite can be made.

At this point, I'm curious if there's some other way to get around this. For example, would it be possible to make smaller compactions which do not cover the entire keyspace? Or compacting some "random" subset of the indexed iterators, a few at a time? Glad for any advice you can give -- I'm aware this isn't your full-time job, so thanks for even reading this :)