alliedmodders / sourcemod

SourceMod - Source Engine Scripting and Administration
http://www.sourcemod.net/
985 stars 423 forks source link

Threaded queries hold an IDatabase handle while queued #1505

Open dragokas opened 3 years ago

dragokas commented 3 years ago

Help us help you

Environment

Description

I do like to make 50000 single queries to a local DB, but I'm receiving handles leak. What am I doing wrong?

According to docs1 & docs2, cloned handles are automatically closed after callback of threaded operation is executed.

Problematic Code (or Steps to Reproduce)

#pragma semicolon 1
#pragma newdecls required

#include <sourcemod>

#define MY_TABLE    "My_Table"

Database g_hDB;

public void OnConfigsExecuted()
{
    if( !g_hDB )
    {
        DB_Connect();
    }
}

void DB_Connect()
{
    char error[256];
    g_hDB = SQLite_UseDatabase("sourcemod-local", error, sizeof(error));
    OnDBConnect(g_hDB, error, 0);
}

public void OnDBConnect(Database db, const char[] error, any data)
{
    if( db == null || error[0] )
    {
        SetFailState("OnDBConnect: %s", error);
        return;
    }
    CreateTables();
}

void CreateTables()
{
    char query[640];
    FormatEx(query, sizeof(query), "CREATE TABLE IF NOT EXISTS `" ... MY_TABLE ... "` (\
        `lang` INTEGER UNSIGNED NOT NULL default 0, \
        `phrase` VARCHAR(%i) NOT NULL, \
        `translation` VARCHAR(%i) NOT NULL, \
        PRIMARY KEY ( lang, phrase) );", 128, 4096);

    g_hDB.Query(SQL_Callback_TableCreate, query);
}

public void SQL_Callback_TableCreate(Database db, DBResultSet hQuery, const char[] error, any data)
{
    if (!db || !hQuery) { LogError(error); return; }

    g_hDB.SetCharset("utf8");

    Test();
}

void Test()
{
    CreateTimer(0.2, Timer_Query, _, TIMER_REPEAT);
}

public Action Timer_Query(Handle timer)
{
    const int chunks = 1000;

    static int times;
    ++ times;
    if( times * chunks > 50000 ) return Plugin_Stop;

    for( int i = 0; i < chunks; i++ )
    {
        SQL_Insert(0, "key", "value");
    }
    PrintToServer("queries = %i", times * chunks);

    return Plugin_Continue;
}

bool SQL_Insert(int lang_num, char[] phrase, char[] translation)
{
    char query[1000];

    g_hDB.Format(query, sizeof(query), "REPLACE INTO `" ... MY_TABLE ... "` (lang, phrase, translation) \
        VALUES (%i, '%s', '%s');", lang_num, phrase, translation);

    g_hDB.Query(SQL_Callback, query);
}

void SQL_Callback(Database db, DBResultSet hQuery, const char[] error, any data)
{
    if (!db || !hQuery) { LogError(error); return; }
}

Logs

L 06/21/2021 - 00:07:46: [SM] Exception reported: Memory leak
L 06/21/2021 - 00:07:46: [SM] Blaming: TestDB.smx
L 06/21/2021 - 00:07:46: [SM] Call stack trace:
L 06/21/2021 - 00:07:46: [SM]   [0] Database.Query
L 06/21/2021 - 00:07:46: [SM]   [1] Line 87, H:\_To_games\Left4Dead_2\My_mods\MapChanger\MapChanger\Localize\TestDB.sp::SQL_Insert
L 06/21/2021 - 00:07:46: [SM]   [2] Line 73, H:\_To_games\Left4Dead_2\My_mods\MapChanger\MapChanger\Localize\TestDB.sp::Timer_Query
L 06/21/2021 - 00:07:46: [SM] MEMORY LEAK DETECTED IN PLUGIN (file "TestDB.smx")
L 06/21/2021 - 00:07:46: [SM] MEMORY LEAK DETECTED IN PLUGIN (file "TestDB.smx")
L 06/21/2021 - 00:07:46: [SM] Unloading plugin to free 31784 handles.
L 06/21/2021 - 00:07:46: [SM] Unloading plugin to free 31784 handles.
L 06/21/2021 - 00:07:46: [SM] Contact the author(s) of this plugin to correct this error.
L 06/21/2021 - 00:07:46: [SM] Contact the author(s) of this plugin to correct this error.
L 06/21/2021 - 00:07:46: --------------------------------------------------------------------------
L 06/21/2021 - 00:07:46: --------------------------------------------------------------------------
L 06/21/2021 - 00:07:46: Type   IDatabase           |   Count   31783
L 06/21/2021 - 00:07:46: Type   IDatabase           |   Count   31783
L 06/21/2021 - 00:07:46: Type   Timer               |   Count   1
L 06/21/2021 - 00:07:46: Type   Timer               |   Count   1
L 06/21/2021 - 00:07:46: --------------------------------------------------------------------------
L 06/21/2021 - 00:07:46: --------------------------------------------------------------------------
L 06/21/2021 - 00:07:46: Oldest Living Handle: Timer created at 06/21/2021 - 00:07:40
L 06/21/2021 - 00:07:46: Oldest Living Handle: Timer created at 06/21/2021 - 00:07:40
L 06/21/2021 - 00:07:46: Newest Living Handle: IDatabase created at 06/21/2021 - 00:07:46
L 06/21/2021 - 00:07:46: Newest Living Handle: IDatabase created at 06/21/2021 - 00:07:46
L 06/21/2021 - 00:07:46: -- Approximately 0 bytes of memory are in use by (31784) Handles.

L 06/21/2021 - 00:07:46: -- Approximately 0 bytes of memory are in use by (31784) Handles.

L 06/21/2021 - 00:07:46: [SM] Exception reported: Memory leak
L 06/21/2021 - 00:07:46: [TestDB.smx] Could not alloc handle
dvander commented 3 years ago

I can guarantee this is due to a silly mechanism in SourceMod. You're allowed to close the handle while the query is in-flight. To make this work, we duplicate the handle while it's in the thread queue. So effectively here you have 50,000 handles.

We should check to see if there are any compatible handles in the thread queue and re-use them before duplicating.

KyleSanderson commented 3 years ago

The actual issue here is some artificial delay that was added because there were people with bad disks. Specifically, on Linux, SQLite was so effective at increasing the disk utilisation that people were unable to hit the 15ms budget before the next frame. This became more apparent after the (much improved) threading work was added for queries. This work removed the previous even longer delay, which would cause clientprefs and other plugins to not get data on level change for minutes. Additionally, with this usecase, it wasn't even possible until the handle limit was moved from 16k to 64k which was done for TF2Items.ext.

https://github.com/alliedmodders/sourcemod/commit/0f6063af60bddd0793669f44553830762a843ce7#diff-9599b74620d0194cc5b84729765405b298e6f4134f22f2170e56fc11fd7fa550

While it isn't used often, I believe in this case you should actually be able to specify multiple values for your REPLACE (I believe this query can be much more effecient outside of this) which will greatly improve the through of the driver in general, our database system, and finally your plugin.

KyleSanderson commented 3 years ago

https://stackoverflow.com/a/5009740

You might be limited to 500 values in a single statement, or 1000000 bytes, but that could give you atleast a 500x performance improvement from a SM point of view.

Wend4r commented 3 years ago

https://stackoverflow.com/a/5009740

You might be limited to 500 values in a single statement, or 1000000 bytes, but that could give you atleast a 500x performance improvement from a SM point of view.

I can also suggest using Transactions so that SM doesn't clone the Database handles many times.

dragokas commented 3 years ago

I can also suggest using Transactions so that SM doesn't clone the Database handles many times.

@Wend4r, thanks. At first glance it looks like also the leak, when get called with 50000 q/per single transaction.

L 06/21/2021 - 20:06:50: Type   IQuery              |   Count   31777
L 06/21/2021 - 20:06:50: Type   IQuery              |   Count   31777
L 06/21/2021 - 20:06:50: Type   IDatabase           |   Count   2
L 06/21/2021 - 20:06:50: Type   IDatabase           |   Count   2

However, if we decrease N of queries per transaction to be ~ <= 31770 (even if total N of queries per same 1 frame become > 50000), no errors happen, and no leaks displayed in handles dump. So, with > 32000 q/per transaction, that's much like an overflow.

Here is a good working sample (for history):

public Action Timer_Transaction(Handle timer)
{
    const int tr_count = 5;
    const int chunks = 10000; // should be < 31000

    char key[] = "key";
    char value[] = "value";

    char query[2*(sizeof(key)+sizeof(value)+100)+1];

    g_hDB.Format(query, sizeof(query), "REPLACE INTO `" ... MY_TABLE ... "` (lang, phrase, translation) \
        VALUES (%i, '%s', '%s');", 0, key, value);

    Transaction tx;

    for( int j = 0; j < tr_count; j ++ )
    {
        tx = new Transaction();

        for( int i = 0; i < chunks; i++ )
        {
            tx.AddQuery(query);
        }
        g_hDB.Execute(tx, SQL_Tx_Success, SQL_Tx_Failure);
    }
    PrintToServer("total queries: %i", tr_count * chunks);

    return Plugin_Stop;
}
public void SQL_Tx_Success(Database db, any data, int numQueries, DBResultSet[] results, any[] queryData)
{
    PrintToServer("TX Success");
}
public void SQL_Tx_Failure (Database db, any data, int numQueries, const char[] error, int failIndex, any[] queryData)
{
    PrintToServer("TX Failed");
    LogError(error);
}

@KyleSanderson, thanks for your suggestions and your time.

Surely you didn't know my real use case, which is:

You might be limited to 500 values in a single statement, or 1000000 bytes, but that could give you atleast a 500x performance improvement from a SM point of view.

So, that is also a possible solution, but in a cost of:

And I still need to know - what is a MAX number of simultaneous queries per frame are allowed without handles leaking. (because even those long query is not enough to fit my needs with one sigle call).

All at all, the above method was very useful for my other project. So, thanks again!

dragokas commented 3 years ago

hmmm, looks like

        for( int i = 0; i < chunks; i++ )
        {
            tx.AddQuery(query);
        }

also required additional dynamic memory, but I've no idea how to calculate it properly. Sorry for off-topic.

UPD. Well, it seems if I split transactions by much little chunks, like:

const int tr_count = 25;
const int chunks = 2000;

pragma dynamic is not required anymore.

dvander commented 3 years ago

@KyleSanderson the frame budget has no relation to this bug at all. As mentioned, the bug is handle duplication.

dvander commented 3 years ago

https://cs.alliedmods.net/sourcemod/source/core/logic/smn_database.cpp#203

This is the culprit, but I don't have a quick fix. The handle needs to be shared or re-used in some way rather than re-created for every op.

KyleSanderson commented 3 years ago

https://cs.alliedmods.net/sourcemod/source/core/logic/smn_database.cpp#203

This is the culprit, but I don't have a quick fix. The handle needs to be shared or re-used in some way rather than re-created for every op. @KyleSanderson the frame budget has no relation to this bug at all. As mentioned, the bug is handle duplication.

Not true (unless if we're leaking this - I don't think we are). Each query takes 2x handles as indicated (so yes, there's this issue). The actual issue is the artificial frame delay at 20ms which coupled with an unknown DBI exec time doesn't give a definite amount of queries per frame. A 66 tick server can only run a frame within 15ms, so at 20ms it's guaranteed to skip a frame at the very least. Even if we change this it's moving the goal post again (which before was 8k, now it's 32k because they're blowing the (already too high) 64k budget).

Surely you didn't know my real use case, which is: reading and parsing huge "resource" file(s) line by line saving key-value of each line into DB.

OK so don't query per line...

So, that is also a possible solution, but in a cost of: additional memory more CPU time consuming (due to my use case) to construct such a long string for query, e.g. with StrCat() the requirement to increase pragma dynamic for declaring large char[], which is not desirable since I'm coding inc-file, and those limit can be accidentally re-defined in the main sp by other devs.

Also not true. Remember you can pass an index into any String just like C. This skips all reformatting, recopying, and is a direct write to memory. if (sizeof (sKeyBuffer) - iLength < strlen(elements)) submit your query, if it isn't then keep adding Values to the query.

        iLength = FormatEx(sKeyBuffer, sizeof(sKeyBuffer), "Target: %N\n", iTarget);
        if(iButtons & IN_FORWARD)
            iLength += strcopy(sKeyBuffer[iLength], (sizeof(sKeyBuffer) - iLength), "     W     ");
        else
            iLength += strcopy(sKeyBuffer[iLength], (sizeof(sKeyBuffer) - iLength), "     -     ");

Anyway, there's other core things that can be done to improve this, but ultimately this is a script error and there are massive performance improvements that can be done by writing much more efficient code.

dragokas commented 3 years ago

Also not true. Remember you can pass an index into any String just like C.

Well, you better know. Of course, if that cost an effort, if you mean by that having noticeably more performance in the second thread by the cost of wasting minor operations with memory & CPU time in the main thread which much more significant. Thanks for code. I thought StrCat doing the same. I'll take a note that is not.

dvander commented 3 years ago

Kyle the frame timing is completely unrelated. Please understand this. It has nothing to do with this bug.