surrealdb / surrealdb.net

SurrealDB SDK for .NET
https://surrealdb.com
Apache License 2.0
93 stars 18 forks source link

Potential performance problem #104

Closed Thomasparsley closed 4 months ago

Thomasparsley commented 4 months ago

For some reason when I do an HTTP query to get all data from a table that contains 3 records the average response is 50ms.

But when I do the query directly using the HTTP client (Insomnia) the query takes under 1ms. Even in python library query using WS takes under 1ms.

For both HTTP and WS client the query duration is same.

      Sending HTTP request GET http://127.0.0.1:8000/key/migrations
info: System.Net.Http.HttpClient.[SurrealDB] http://127.0.0.1:8000/.ClientHandler[101]
      Received HTTP response headers after 44.1242ms - 200
info: System.Net.Http.HttpClient.[SurrealDB] http://127.0.0.1:8000/.LogicalHandler[101]
      End processing HTTP request after 44.2145ms - 200

os: macOS 14.5 chipset: Apple M3 Pro 18gb docker: 4.30.0 surreal: 1.5.2 dotnet: 8.0.302 surreal.net: 0.5.0

Odonno commented 4 months ago

Hello Thomas,

50ms is not an eternity but it still feels very suspicious indeed.

  1. Let me just start that benchmarks are run daily and the last one (you can see here) that run gives me the following:
Method Runtime Mean Error StdDev Gen0 Allocated
Http NativeAOT 8.0 18.522 ms 0.2651 ms 0.2479 ms - 259.16 KB
HttpWithClientFactory NativeAOT 8.0 18.527 ms 0.2570 ms 0.2279 ms - 262.46 KB
WsText NativeAOT 8.0 10.812 ms 0.1067 ms 0.0998 ms 15.6250 2015.37 KB
Http .NET 8.0 18.865 ms 0.3771 ms 0.4191 ms - 289.67 KB
HttpWithClientFactory .NET 8.0 18.873 ms 0.2826 ms 0.2643 ms - 295.26 KB
WsText .NET 8.0 9.100 ms 0.0882 ms 0.0782 ms - 2011.19 KB

Let's concentrate on the .NET 8.0 runtime for now. The WsText protocol is the fastest one and it gets 1_000 records in less than 10ms. Details of the machine where the test ran:

BenchmarkDotNet v0.13.10, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
AMD EPYC 7763, 1 CPU, 4 logical and 2 physical cores
.NET SDK 8.0.302
  [Host]     : .NET 8.0.6 (8.0.624.26715), X64 RyuJIT AVX2
  Job-BVNALS : .NET 8.0.6, X64 NativeAOT AVX2
  Job-GMCZFX : .NET 8.0.6 (8.0.624.26715), X64 RyuJIT AVX2

I assume your machine is even more performant than this one.

  1. I can see that you are running an ASP.NET Core app. So, is 50ms the time the library spent to retrieve and deserialize the response, or is it the total time of the HTTP server request? Note that an ASP.NET server can add overhead (middleware, logging, etc..).

  2. If this is indeed the total time of the server request, you can use a Stopwatch around the Select method to measure the (almost exact) time the .NET SDK take to get and the result.

Thomasparsley commented 4 months ago

Of course 50ms is not forever. It just struck me that the same query SELECT * FROM migrations in surrealist.app takes +- 0.5ms and in dotnet8 100x more.

The 50ms is measured with Stopwatch and in python time.time directly on the Query method. Added log from console approximately the same with Stopwatch measurement.

Odonno commented 4 months ago

This is weird. Do you have a reproduction program to share?

Thomasparsley commented 4 months ago
using System.Diagnostics;
using System.Text.Json;

using SurrealDb.Net;

var builder = WebApplication.CreateBuilder(args);

var url = Var("SURREAL_URL") ?? "127.0.0.1:8000";
var protocol = Var("SURREAL_PROTOCOL") ?? "ws";
var suffix = protocol.StartsWith("ws") ? "/rpc" : "";
var endpoint = $"{protocol}://{url}{suffix}";

builder.Services.AddSurreal(
    SurrealDbOptions
            .Create()
            .WithEndpoint(endpoint)
            .WithNamespace(Var("SURREAL_NAMESPACE") ?? "default_ns")
            .WithDatabase(Var("SURREAL_DATABASE") ?? "default_db")
            .WithUsername(Var("SURREAL_USERNAME") ?? "root")
            .WithPassword(Var("SURREAL_PASSWORD") ?? "root")
            .Build(),
    lifetime: ServiceLifetime.Scoped,
    configureJsonSerializerOptions: options => {
        options.PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseLower;
    }
);

var app = builder.Build();

app.MapGet("/", () => "Hello World!");

app.MapGet("/AllMigrations", async (ISurrealDbClient client, CancellationToken ct) => {
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    var response = await client.Query($"SELECT * FROM migrations", ct);
    stopwatch.Stop();

    Console.WriteLine($"Ellapsed on AllMigrations query: {stopwatch.ElapsedMilliseconds}ms");
});

app.MapGet("/GetCompletedMigrations", async (ISurrealDbClient client, CancellationToken ct) => {
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    var response = await client.Query($"SELECT VALUE name FROM migrations WHERE completed = true", ct);
    stopwatch.Stop();

    Console.WriteLine($"Ellapsed on GetCompletedMigrations query: {stopwatch.ElapsedMilliseconds}ms");

    stopwatch = new Stopwatch();
    stopwatch.Start();
    var names = response.GetValues<string>(0);
    stopwatch.Stop();

    Console.WriteLine($"Ellapsed on GetValues<string> for GetCompletedMigrations: {stopwatch.ElapsedMilliseconds}ms");

    return names;
});

app.Run();

static string? Var(string varName) {
    return Environment.GetEnvironmentVariable(varName);
}
Odonno commented 4 months ago

Ah, thank you for taking the time to make a reproduction example.

I can see 2 major performance limitations. This first one made me jump from 30ms to 3ms. The second one down to 1ms. I'll explain.

  1. The main pain point if I should say so is that you set lifetime: ServiceLifetime.Scoped, which mean that every time you trigger an endpoint, it will create a new instance of the client. When you create a new client, it is not connected to the database, so you lost a lot of time not only to create a new client but also to create a new Socket connection (whether HTTP or WS). I can see that you do not depend on authentication feature (i.e. per user query), so setting ServiceLifetime.Singleton would be better for your use case.

I am currently working on a way to consume a pool of clients, so that you do not have to recreate a client and better yet not to reconnect on each instance injection. So, I'd recommend to use Singleton (if you can) for the best performance until this feature is available.

  1. You are setting configureJsonSerializerOptions to set the JsonNamingPolicy. As of now, this factory will create a new JsonSerializer each time you make a request. And creating a JsonSerializer is costly. I already implemented a cache to remove this performance penalty. It will come soon.

Hope that would be helpful to you. Feel free to ask any question.

Thomasparsley commented 4 months ago

That makes sense. I thought client pooling was implemented, so I automatically chose scoped lifetime, but singleton is the way to go for me too.

Unfortunately, I removed all the converters I have implemented for Strong Typed IDs from this demo. So I'll wait.

I'm closing this issue now that penalties are already addressed. Thanks for the explanation and I look forward to the update!

P.S. Wouldn't it be useful to note that for now there is no client pooling and JsonSerializer is not cached?