CoreyKaylor / Lightning.NET

.NET library for LMDB key-value store

Issue after a large number of inserts #129

Closed: darksody closed this issue 3 years ago

darksody commented 3 years ago

Hello,

I am trying to insert a large number of entries, about 20 million (unique domain and subdomain names). The value of each one is rather small: two ints separated by a comma, with a max value of 100. So an entry looks like: key: "mydomain.com", value: "13,2".

I UTF8-encode the keys and values before inserting them. The problem is that the insertion stops at some point without giving me any error. After every 50k entries I insert, I query the LMDB entry count to check that I have inserted as expected. The count stops at around 3.7 million entries, while the Put operations keep going.

I should mention that I have set the size of the DB to 4 GB, which should be enough (I have the same values in Redis, and Redis takes ~3 GB of RAM).

This is how I insert batches of 50k:

public void InsertBatch(Dictionary<string, string> entities)
{
    using (var tx = _env.BeginTransaction())
    {
        using (var db = tx.OpenDatabase())
        {
            foreach (var entity in entities)
            {
                tx.Put(db, Encoding.UTF8.GetBytes(entity.Key), Encoding.UTF8.GetBytes(entity.Value));
            }
            tx.Commit();
        }
    }
}

And this is how i declare the _env:

public LmdbRepository(string folderPath, long dbSize = 0)
{
    _folderPath = folderPath;
    _env = new LightningEnvironment(_folderPath);

    if (dbSize == 0)
    {
        _env.MapSize = 4 * 1024 * 1024 * 1024L; // 4 GB
    }
    else
    {
        _env.MapSize = dbSize;
    }
    _env.Open();
}

I'm not sure if there's anything I'm missing at this point, but at the end, if I call tx.GetEntriesCount(db), I only get ~3.7m instead of the 20m I'm inserting (and yes, I'm sure they are not duplicates).

AlgorithmsAreCool commented 3 years ago

This is interesting. I'll take a look and get back to you shortly.

AlgorithmsAreCool commented 3 years ago

@darksody btw, I assume you are using the newest version of the library, correct?

AlgorithmsAreCool commented 3 years ago

So far I can't replicate this. I've tested string keys up to 10 million entries. Working on a test for larger insertions now.

AlgorithmsAreCool commented 3 years ago

I've tested 50,000,000 records so far and it seems to be working.

Can you try inspecting the return value of tx.Put to make sure it is Success every time?

Aside from testing random key insertion, I'm not sure I can proceed without knowing more about your data or your environment. I'll try to use some keys that look more like your URL example.
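
Something along these lines would do it (a sketch adapted from your InsertBatch above; InsertBatchChecked is just a name I made up, and MDBResultCode is the result enum that Put and Commit return):

public void InsertBatchChecked(Dictionary<string, string> entities)
{
    using (var tx = _env.BeginTransaction())
    using (var db = tx.OpenDatabase())
    {
        foreach (var entity in entities)
        {
            var putResult = tx.Put(db, Encoding.UTF8.GetBytes(entity.Key), Encoding.UTF8.GetBytes(entity.Value));
            if (putResult != MDBResultCode.Success)
            {
                // Surface the first failure instead of silently continuing.
                Console.WriteLine($"Put of {entity.Key} failed with {putResult}");
                break;
            }
        }

        // The commit reports a result code too, and that is the one that
        // decides whether anything was actually persisted.
        var commitResult = tx.Commit();
        if (commitResult != MDBResultCode.Success)
        {
            Console.WriteLine($"Commit failed with {commitResult}");
        }
    }
}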

darksody commented 3 years ago

Yes, I'm using the latest version. I think it's the nature of my keys. I tried with keys like "hello{i}" and value "world{i}" where i is my cursor from 1 to 20,000,000. With a 2 GB file it stops at some point. I'll try tomorrow to test the return of tx.Put, I did not try that yet (frankly I didn't notice it returns a status, sorry).

AlgorithmsAreCool commented 3 years ago

I tested 50 million keys with the pattern hello_{i} and it still seems to work for me.

Even more interesting, however, is the performance difference between the two key patterns 🤔

darksody commented 3 years ago

I do get MDBResultCode.Success, which is weird, because it's definitely not adding anymore. Here's a pastebin of my simple test app (I limited the DB size to 512 MB so I hit the limit faster): https://pastebin.com/fVHyCujx

AlgorithmsAreCool commented 3 years ago

OK, I'm getting your results.

Added 7800000 / 20000000. Lmdb size: 7800000
Added 7850000 / 20000000. Lmdb size: 7850000
Added 7900000 / 20000000. Lmdb size: 7900000
Added 7950000 / 20000000. Lmdb size: 7950000
Added 8000000 / 20000000. Lmdb size: 7950000
Added 8050000 / 20000000. Lmdb size: 7950000
Added 8100000 / 20000000. Lmdb size: 7950000
Added 8150000 / 20000000. Lmdb size: 7950000
Added 8200000 / 20000000. Lmdb size: 7950000

Give me some time to poke at this...

AlgorithmsAreCool commented 3 years ago

Checking the return value of tx.Put shows:

hello_7956899 Failed with BadTxn
hello_7956900 Failed with BadTxn
hello_7956901 Failed with BadTxn
hello_7956902 Failed with BadTxn

darksody commented 3 years ago

My bad, I was checking the wrong result code: the one from the Put operation instead of the one from the commit. Do you know what BadTxn means? Because if you considerably increase the DB size, it will work and insert all of them.
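
For example, with the constructor from my first post, something like this (the 8 GB figure is just an illustration; MapSize is the maximum the data file is allowed to grow to, so it has to cover the final data volume):

// Sketch only: pass an explicit size instead of relying on the 4 GB default.
var repo = new LmdbRepository(folderPath, dbSize: 8L * 1024 * 1024 * 1024); // 8 GB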

AlgorithmsAreCool commented 3 years ago

I had this same error when I was testing yesterday, but I corrected it. Trying to remember how 😓...

drweeto commented 3 years ago

I tried the pastebin above. For what it's worth, I saw that the first Put that goes wrong in a batch returns MapFull; thereafter it returns BadTxn. The Commit always just returns BadTxn if the transaction has failed.

Is that what you're both seeing?
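
If it helps, a rough way to surface that first code instead of the flood of BadTxn lines (a sketch only, reusing the names from the InsertBatch snippet above):

var firstFailure = MDBResultCode.Success;

foreach (var entity in entities)
{
    var code = tx.Put(db, Encoding.UTF8.GetBytes(entity.Key), Encoding.UTF8.GetBytes(entity.Value));
    if (code != MDBResultCode.Success && firstFailure == MDBResultCode.Success)
    {
        // MapFull on the first failing Put; every later Put reports BadTxn.
        firstFailure = code;
    }
}

// Once a Put has failed, the Commit also just reports BadTxn.
Console.WriteLine($"First failure: {firstFailure}, commit: {tx.Commit()}");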

AlgorithmsAreCool commented 3 years ago

Ah! That might be it. The console was scrolling so fast with errors that I never saw the first one.

AlgorithmsAreCool commented 3 years ago

@darksody

OK, after more testing, I believe your issue is that you are hitting the MapSize limit (thanks @drweeto!).

The last few lines of my test app's output look like this:

Wrote 100,000 / 15,100,000 in 81.571 milliseconds. Rate = 1,225,924.38
    Stats { MapSize = 1,073,741,824, AllocatedPages = 254,740, AllocatedSize = 1,043,415,040, MapUtilization = 97.18%, FreeSpace = 30,326,784 }
Wrote 100,000 / 15,200,000 in 83.141 milliseconds. Rate = 1,202,774.56
    Stats { MapSize = 1,073,741,824, AllocatedPages = 256,503, AllocatedSize = 1,050,636,288, MapUtilization = 97.85%, FreeSpace = 23,105,536 }
Wrote 100,000 / 15,300,000 in 89.548 milliseconds. Rate = 1,116,720.77
    Stats { MapSize = 1,073,741,824, AllocatedPages = 258,265, AllocatedSize = 1,057,853,440, MapUtilization = 98.52%, FreeSpace = 15,888,384 }
Wrote 100,000 / 15,400,000 in 88.774 milliseconds. Rate = 1,126,454.68
    Stats { MapSize = 1,073,741,824, AllocatedPages = 260,030, AllocatedSize = 1,065,082,880, MapUtilization = 99.19%, FreeSpace = 8,658,944 }
Key hello_15498527 failed with code MapFull

You can calculate statistics about each database in your environment with the following helper code. There is not currently a clean way of doing this for the entire environment.

        private static void PrintStats()
        {
            using var tx = LmdbEnv.BeginTransaction();
            using var db = tx.OpenDatabase("TestDb");

            var stats = db.DatabaseStats;

            var mapSize = LmdbEnv.MapSize;
            var allocatedPages = stats.LeafPages + stats.BranchPages + stats.OverflowPages;
            var allocatedBytes = allocatedPages * stats.PageSize;
            var freeBytes = mapSize - allocatedBytes;
            var utilization = allocatedBytes / (double)mapSize;

            var printStuff = new {
                MapSize = mapSize.ToString("N0"),
                AllocatedPages = allocatedPages.ToString("N0"),
                AllocatedSize = allocatedBytes.ToString("N0"),
                MapUtilization = utilization.ToString("P2"),
                FreeSpace = freeBytes.ToString("N0")
            };

            Console.WriteLine($"    Stats {printStuff}");
        }
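
For reference, the Stats lines in the output above come from calling the helper after each batch, along these lines (a hypothetical driver loop; TotalEntries, BatchSize, and InsertNextBatch are placeholders, not part of the test app):

for (var written = 0; written < TotalEntries; written += BatchSize)
{
    InsertNextBatch(BatchSize); // placeholder for the actual insert code
    PrintStats();               // MapUtilization climbs toward 100% as pages are allocated
}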

I think this library can do better in a few ways.

If you don't have any further questions about this issue, I'm going to close it.