launchbadge / sqlx

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, and SQLite.

sqlx fetch_all for huge amounts of data much slower than tokio-postgres #2007

Open dragonnn opened 2 years ago

dragonnn commented 2 years ago

I recently started using sqlx to query a huge amount of data, about 12,096,001 rows x 6 columns. The database is PostgreSQL with TimescaleDB on top of it. The rows are 0.1 s samples from vibration sensors covering two weeks.

Running EXPLAIN ANALYZE on my query showed a much lower query time than I was getting in my program using sqlx, so I decided to test tokio-postgres. It turns out tokio-postgres is almost 50% faster than sqlx, even when mapping the tokio-postgres Row type to the same struct sqlx is querying into. Exact numbers:

tokio_postgres_query_time: 9386ms
tokio_postgres_query_time with map: 13028ms
sqlx_query_time query_as: 29830ms
sqlx_query_time query: 29205ms

"with map" means mapping the tokio-postgres output to my struct, and it is still much faster. This is with tokio-postgres running the query first and sqlx second, so the database might even have something cached already in sqlx's favor. I know sqlx doesn't focus on raw performance, but I always assumed that was mostly about throughput when running many queries, not about a single fetch_all query. I also tested without query_as and the times are almost the same, so the bottleneck isn't in mapping structs. My Rust struct:

use serde::{Deserialize, Serialize};
use sqlx::{Encode, FromRow};
use uuid::Uuid;

#[derive(FromRow, Encode, Debug, Serialize, Deserialize, PartialEq, Clone, pg_mapper::TryFromRow)]
pub struct Data {
    pub datetime: chrono::NaiveDateTime,
    pub measuring_result: f64,
    pub alert_alarm: bool,
    pub danger_alarm: bool,
    pub work_state_id: Option<i32>,
    pub device_uuid: Uuid,
}

My table:

CREATE TABLE vibrodetectors.data (
    datetime timestamp NOT NULL,
    measuring_result double precision NOT NULL,
    alert_alarm bool NOT NULL,
    danger_alarm bool NOT NULL,
    device_uuid uuid NOT NULL,
    work_state_id integer,
    CONSTRAINT data_pk PRIMARY KEY (datetime, device_uuid)
);

And the query itself:

SELECT * FROM vibrodetectors.data WHERE device_uuid = ($1) AND (datetime >= $2 AND ($3 OR datetime <= $4)) ORDER BY datetime ASC

I don't think I need to paste the Rust code itself, but I do plan to extract the code from my project and create a minimal example for testing and getting some flamegraphs. For now I unfortunately need to integrate tokio-postgres into my project and use it for pulling the huge datasets out of the db. This was all done with sqlx 0.6 and tokio-postgres 0.7.6.

abonander commented 2 years ago

FromRow is going to be problematic because it's doing a hashmap lookup for every column of every row. I have some thoughts on how to improve that (doing the lookup once per query and caching the results) but that's going to have to be part of 0.7.0 since I'll need to make breaking changes to the trait.

For now, either using query_as!() or a handwritten mapper that looks up values by index instead of by name might perform better.
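For illustration, a hand-written by-index mapper along those lines might look like the sketch below. This is not code from the thread: it assumes sqlx's Row::try_get with a usize index, the column order produced by SELECT * on the table above, and the chrono and uuid features being enabled.

use sqlx::{postgres::PgRow, Row};

// Sketch: read each column by ordinal position instead of by name,
// avoiding the per-row name lookups that the FromRow derive performs.
// The indexes assume the column order of SELECT * on vibrodetectors.data.
fn data_from_row(row: &PgRow) -> Result<Data, sqlx::Error> {
    Ok(Data {
        datetime: row.try_get(0)?,
        measuring_result: row.try_get(1)?,
        alert_alarm: row.try_get(2)?,
        danger_alarm: row.try_get(3)?,
        device_uuid: row.try_get(4)?,
        work_state_id: row.try_get(5)?,
    })
}

Used with fetch_all on a plain sqlx::query, something like rows.iter().map(data_from_row).collect::<Result<Vec<_>, _>>() would then replace the derived FromRow path.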

dragonnn commented 2 years ago

These tests were done with query_as!(); the FromRow derive is only for other parts of my application that use it. I should have posted the Rust code 😅. Also, as you can see, I tested query!() too and there was only a really small difference. This is my sqlx code:

let data = sqlx::query_as!(
    Self,
    "SELECT * FROM vibrodetectors.data WHERE device_uuid = ($1) AND (datetime >= $2 AND ($3 OR datetime <= $4)) ORDER BY datetime ASC",
    device_uuid,
    range.0,
    range.1.is_none(),
    range.1.unwrap_or_else(|| chrono::Local::now().naive_local())
)
.fetch_all(pool)
.await?;

And this is tokio-postgres

let data = pool
    .query(
        "SELECT * FROM vibrodetectors.data
         WHERE device_uuid = ($1)
           AND (datetime >= $2 AND ($3 OR datetime <= $4))
         ORDER BY datetime ASC",
        &[
            &device_uuid,
            &range.0,
            &range.1.is_none(),
            &range.1.unwrap_or_else(|| chrono::Local::now().naive_local()),
        ],
    )
    .await?
    .into_iter()
    .map(|r| r.try_into().unwrap())
    .collect();

So no, unfortunately this difference isn't from FromRow, sorry about the confusion.

dragonnn commented 2 years ago

@abonander I created a bench repo for this issue: https://github.com/dragonnn/sqlx_fetch_all_bench. It can be run with just docker-compose up; note that it takes about ~3 GB of free space to create the sample data, and the bench itself runs for a significant time with high system load. On my machine I got these results:

sqlx_query_as           time:   [43.255 s 43.402 s 43.255 s]
sqlx_query              time:   [43.888 s 44.305 s 44.750 s]
tokio_postgres          time:   [12.301 s 12.327 s 12.436 s]
tokio_postgres_map      time:   [16.386 s 16.489 s 16.598 s]

I know 10 samples aren't much, but they are consistent.

abonander commented 2 years ago

By profiling https://github.com/diesel-rs/diesel/tree/master/diesel_bench I figured out that we're doing 100x the read calls that tokio-postgres does, due to the following routine being called from lots of different places with very small byte-counts: https://github.com/launchbadge/sqlx/blob/main/sqlx-core/src/io/buf_stream.rs#L139

I believe the refactors that added this routine were more concerned with cancellation-safety than performance, and so didn't consider the ramifications of making lots of small read() calls all the time.

Ironically, using TLS probably alleviates the issue somewhat as it introduces additional buffering a level below this.
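For context, here is a minimal sketch of the buffering idea, not sqlx's actual code: on a bare socket each small read is roughly one syscall, whereas a userspace buffer such as tokio's BufReader fills a larger buffer once and serves the small reads from memory. The function name and the 1-byte tag + 4-byte length framing below are just a hypothetical Postgres-style example.

use tokio::io::{AsyncReadExt, BufReader};
use tokio::net::TcpStream;

// Illustration only (not sqlx internals): reading a Postgres-style message
// header (1-byte tag + 4-byte big-endian length). On a bare TcpStream each
// read_u8/read_u32 call issues its own tiny read(); behind a BufReader they
// are served from an 8 KiB buffer filled by far fewer syscalls.
async fn read_message_header(stream: TcpStream) -> std::io::Result<(u8, u32)> {
    let mut buffered = BufReader::with_capacity(8 * 1024, stream);
    let tag = buffered.read_u8().await?;
    let len = buffered.read_u32().await?;
    Ok((tag, len))
}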

dragonnn commented 2 years ago

Thanks! I suspected it might be something like that. Any chance of fixing it without compromising cancellation safety? Unfortunately, TLS is often simply not needed and is an unnecessary pain to set up.

abonander commented 2 years ago

My next project (hopefully) is to refactor the drivers so that the runtime and TLS features can be orthogonal, and part of that would include fixing this while improving cancellation-safety in general.

dragonnn commented 1 year ago

> My next project (hopefully) is to refactor the drivers so that the runtime and TLS features can be orthogonal, and part of that would include fixing this while improving cancellation-safety in general.

I did retest my bench with 0.7:

sqlx_query_as           time:   [26.239 s 26.252 s 26.239 s]
sqlx_query              time:   [26.279 s 26.292 s 26.350 s]
tokio_postgres          time:   [12.207 s 12.223 s 12.229 s]
tokio_postgres_map      time:   [15.997 s 16.026 s 16.071 s]

That is almost 50% faster, so much better. It still falls slightly behind tokio_postgres; any chance of getting closer to it? I know matching its exact time might be impossible, but it would be nice to close the gap a little bit more. Thanks for working on 0.7! It is already much better.

EDIT

I also tested the query() and query_as() methods (rather than the macros):

sqlx_query_method       time:   [20.280 s 20.283 s 20.308 s]
sqlx_query_as_method    time:   [25.738 s 25.763 s 25.842 s]

Interesting that the query() method is about 6 s faster than the query!() macro.

EDIT2

Also tested sea-orm, which still uses sqlx 0.6:

sea_orm                 time:   [52.615 s 52.472 s 52.615 s]

Really looking forward to sqlx 0.7, which should improve that too by a lot!

Boscop commented 3 weeks ago

Any update on this? 🙂