weiznich / diesel_async

Diesel async connection implementation
Apache License 2.0
660 stars 82 forks source link

MySQL: `Can't create more than max_prepared_stmt_count statements` after executing more than 16382 inserts #26

Closed gwy15 closed 2 years ago

gwy15 commented 2 years ago

diesel-async + AsyncMysqlConnection gives error after executing more than 16382 inserts, while diesel does not

Can't create more than max_prepared_stmt_count statements (current value: 16382)
outdated reprodicible example Hi. I'm setting up a really simple [template repo](https://github.com/gwy15/diesel-async-example) for my own use and I tried to [insert-select-delete](https://github.com/gwy15/diesel-async-example/blob/main/tests/insert_and_validate.rs#L51) concurrently. Specifically, I used 100 threads/futures, each insert-select-delete 200 times. The thread pool was set max connection count to 30. diesel-async gives the error, whilst disel(sync) does not and can finish as expected. ``` Can't create more than max_prepared_stmt_count statements (current value: 16382) ``` On my local machine they both reached about ~2k QPS. The concurrent connections was 30. reproducible repo (with corresponding failing CI): https://github.com/gwy15/diesel-async-example

Setup

Versions

Feature Flags

What are you trying to accomplish?

Since diesel + multithread does not have a problem, I expect diesel-async to have the same ability.

What is the actual output?

Can't create more than max_prepared_stmt_count statements

Are you seeing any additional errors?

No

Steps to reproduce

migration

CREATE TABLE `user` (
    `id`    bigint unsigned     NOT NULL AUTO_INCREMENT,
    `name`  varchar(255)        NOT NULL,
    `email` varchar(255)        NOT NULL,
    `created_at` datetime(3) NOT NULL DEFAULT now(3),
    `updated_at` datetime(3) NOT NULL DEFAULT now(3) ON UPDATE now(3),
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

main.rs

use diesel_async::{AsyncConnection, AsyncMysqlConnection, RunQueryDsl};

mod schema {
    diesel::table! {
        user (id) {
            id -> Unsigned<Bigint>,
            name -> Varchar,
            email -> Varchar,
            created_at -> Datetime,
            updated_at -> Datetime,
        }
    }
}

#[derive(diesel::Insertable)]
#[diesel(table_name = schema::user)]
pub struct NewUser<'a> {
    pub name: &'a str,
    pub email: &'a str,
}
impl<'a> NewUser<'a> {
    pub async fn insert(&self, conn: &mut AsyncMysqlConnection) -> anyhow::Result<()> {
        diesel::insert_into(schema::user::dsl::user)
            .values(self)
            .execute(conn)
            .await?;
        Ok(())
    }
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let url = dotenv::var("DATABASE_URL")?;

    let mut conn = diesel_async::AsyncMysqlConnection::establish(&url).await?;

    for i in 0..16382 + 10 {
        NewUser {
            name: &format!("name-{}", i),
            email: &format!("email-{}", i),
        }
        .insert(&mut conn)
        .await?;
        if i % 100 == 0 {
            println!("{}", i);
        }
    }
    Ok(())
}

cargo.toml

[dependencies]
anyhow = "1.0.65"
diesel = { version = "2.0.0", features = ["mysql", "chrono"] }
diesel-async = { version = "0.1.0", features = ["bb8", "mysql", "tokio"], optional = true }
dotenv = "0.15.0"
tokio = { version = "1.21.2", features = ["full"] }
outdated Please see my repo https://github.com/gwy15/diesel-async-example, and the CI workflow https://github.com/gwy15/diesel-async-example/actions/runs/3175713190 is exactly what I had on my local machine. to run locally, please ```bash git clone https://github.com/gwy15/diesel-async-example cd diesel-async-example docker-compose up -d diesel migration run cargo t --no-default-features --features async ```

Checklist

gwy15 commented 2 years ago

Update: concurrency is not a must-have to reproduce this problem (though will be more effient), I can reproduce with this simple code:

let pool = db::connect(&url).await?;
let mut conn = pool.get().await?;
for i in 0..16382 + 10 {
    let new = models::NewUser {
        name: &format!("name-{}", i),
        email: &format!("email-{}", i),
    };
    models::User::insert(new, &mut conn).await?;
}
weiznich commented 2 years ago

Thanks for writing this bug report. I cannot reproduce this behavior using mariadb 10.5.16 on linux, even with a much lower max_prepared_stmt_count. Can you add information about your environment? Especially important are the following information:

gwy15 commented 2 years ago

I used docker version of mysql, which is mysql:5.7. the max_prepared_stmt_count should be default value 16382 (https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_prepared_stmt_count).

Changing the value does not affect anything, but commenting this line does eliminate the error

https://github.com/weiznich/diesel_async/blob/main/src/mysql/mod.rs#L67

gwy15 commented 2 years ago

I can reproduce on mariadb on Windows via Docker (mariadb:latest)

version: '3.9'
services:
  mariadb:
    image: mariadb:latest
    ports:
      - "3306:3306"
    volumes:
      - ./maria-data:/var/lib/mysql
    environment:
      MARIADB_ROOT_PASSWORD: NNe47zF3JRhr6Ykb
      MARIADB_DATABASE: db
gwy15 commented 2 years ago

I can make a PR to fix this real quick by applying Queryable::close to dropping last_stmt, but I really appreciate if some questions can be explained to me:

weiznich commented 2 years ago

Why are we using our own statement caching mechanism instead of mysql_async's caching mechanism?

Because mysql_async statement caching mechanism is less powerful than what diesel offers. mysql_async basically caches any statement by using the sql string as cache key entry. Diesel on the other hand "knows" at compile time whether:

The first point let's us decide whether to cache or not to cache, while the second point allows us in some cases to just skip query building at all and use a compile time "id" to address the statement cache and receive an already prepared statement from there. Checkout this documentation for details. (The implementation in diesel_async closely mirrors the main diesel implementation and uses essentially the same query building infrastructure)

Is the AsyncMysqlConnection::last_stmt purely a workround so that the callback fn signature still gets a reference instead of a owned stmt? I guess something like a MaybeOwned is more intuitive from my personal perspective.

Yes that is purely a workaround to get things working. With the assumption that Statement::drop automatically drops the deallocates the statement as well, this seems like the "easy" solution. As demonstrated by this bug report that was wrong.

29 removes this hack and introduces "proper" deallocation of the corresponding statement in almost all cases. The only potential issue with that solution is around canceling the corresponding future. This will likely leak the specific Statement, but I do really not see any good solution there yet.