mbdavid / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License
8.36k stars 1.22k forks source link

[BUG] Performance degradation in v5.0.18 #2451

Open alexbereznikov opened 3 months ago

alexbereznikov commented 3 months ago

Version v5.0.18, v5.0.19

Describe the bug Upsert performance degraded in ~14 times in v5.0.18 when compared to 5.0.17

Code to Reproduce Running this snippet with v5.0.17 and 5.0.18 is enough to see the issue. On my machine, 5.0.17 runs in 12 seconds, and 5.0.18 runs in 2m 50 seconds.

internal class Program
{
    private const string DbName = "test.db";
    private const int Records = 100_0000;
    private const int Chunk = 1000;

    static void Main(string[] args)
    {
        if (File.Exists(DbName))
        {
            File.Delete(DbName);
        }

        using (var db = new LiteDatabase(
            new ConnectionString
            {
                Filename = DbName,
            }))
        {
            var collection = db.GetCollection<TestRecord>();
            collection.EnsureIndex(x => x.Index);

            var data = GenerateData();

            var sw = Stopwatch.StartNew();

            foreach (var chunk in data)
            {
                collection.Upsert(chunk);
            }

            Console.Write(sw.Elapsed);
        }
    }

    private static IReadOnlyCollection<IReadOnlyCollection<TestRecord>> GenerateData()
    {
        var records = new List<TestRecord>(Records);

        for (var i = 0; i < Records; i++)
        {
            records.Add(new TestRecord
            {
                Index = i,
                First = Random.Shared.Next(),
                Second = Random.Shared.Next(),
            });
        }

        return records.Chunk(Chunk).ToArray();
    }

    private sealed class TestRecord
    {
        public int Index { get; init; }

        public int First { get; init; }

        public int Second { get; init; }
    }
}

Expected behavior Expected behavior is that performance in v5.0.18 is more or less the same as in v5.0.17

Screenshots/Stacktrace N/A

Additional context I haven't experimented much, upserting in chunks because that's what we do in our app, maybe chunk size matters, maybe not. Also haven't tried other operations.

alexbereznikov commented 3 months ago

So I decided to profile this repro to understand what leads to this performance degradation. Profiling for 5.0.19 gives the following results: image As we can see, CommitAndReleaseTransaction takes 4x more time than upsert itself, and for some reason we're doing writing to disk in the same thread.

And here is how it looks for v5.0.17: image As we can see, in v5.0.17 CommitAndReleaseTransaction takes little to no time, and all the writing is offloaded to a separate thread.

After some investigation, turns out that there is an issue in AsyncManualResetEvent - when Set is being called, execution continues in DiskWriterQueue.ExecuteQueue after await _queueHasItems.WaitAsync(). I would say, we don't need this separate class at all, as .NET provides a way to await any WaitHandle using thread pool (see https://github.com/StephenCleary/AsyncEx/blob/v5.0.0/src/Nito.AsyncEx.Interop.WaitHandles/Interop/WaitHandleAsyncFactory.cs, DoFromWaitHandle). So I changed that event to ManualResetEventSlim in conjunction with this method.

Now repro runs in 42 seconds. 4x faster than 5.0.19 but still ~3x slower than 5.0.17. Further profiling results: image As we can see, for some reason we do call BasePage.ToString a lot, as well as formatting strings. I'm 100% sure that this should not happen on the hot path. Here is the issue (BasePage.Get):

ENSURE(this.IsValidPos(position), $"invalid segment position in index footer: {ToString()}/{index}");
ENSURE(this.IsValidLen(length), $"invalid segment length in index footer: {ToString()}/{index}");

So I just commented these 2 lines to check and 1 in IndexService, and it runs in 13 seconds, pretty close to v5.0.17.

alexbereznikov commented 3 months ago

Created PR to fix the first part - #2453

alexbereznikov commented 3 months ago

Created PR to fix the second part - #2454

alexbereznikov commented 3 months ago

@mbdavid Sorry but I believe this is a critical issue, in the current state v5.0.19 is not usable. I did all the research and fixes, please consider merging it.

viveleltsi commented 3 months ago

I agree. This fix will help me too. thank you for your work @alexbereznikov . I hope it will be merged

alexbereznikov commented 3 days ago

2515 should fix remaining issues here