sfackler / rust-postgres

Native PostgreSQL driver for the Rust programming language
Apache License 2.0
3.48k stars 443 forks source link

Tokio_postgres hangs on transaction #983

Closed andrewbaxter closed 1 year ago

andrewbaxter commented 1 year ago

It's likely I'm doing something stupid since this is so trivial, but I figure I'd put it here anyway in case someone else has the issue. I did a search and found some other issues related to hangs, but they seemed abandoned or fixed.

I'm using testcontainers (docker containers) for integration testing some db code. I've confirmed the database comes up, and doing docker exec -it psql I can execute queries, list active connections, so that seems okay.

tokio-postgres = "0.7.7"

The test setup and code is:

async fn db<
    'a,
>(
    docker: &testcontainers::clients::Cli,
) -> (
    tokio_postgres::Client,
    tokio_postgres::Connection<tokio_postgres::Socket, tokio_postgres::tls::NoTlsStream>,
    Container<'_, Postgres>,
) {
    let db_container = docker.run(Postgres::default());
    let mut db_config = Config::new();
    db_config.host("127.0.0.1");
    db_config.dbname("postgres");
    db_config.user("postgres");
    db_config.port(db_container.get_host_port_ipv4(5432));
    let (db, db_conn) = db_config.connect(tokio_postgres::NoTls).await.unwrap();
    (db, db_conn, db_container)
}

#[tokio::test]
async fn test_base() {
    let docker = testcontainers::clients::Cli::default();
    let (mut db, _conn, _cont) = db(&docker).await;
    println!("1");
    pg_gen_insert2::migrate(&mut db).await.unwrap();
    println!("2");
...
}

and then migrate does

pub async fn migrate(db: &mut tokio_postgres::Client) -> Result<(), GoodError> {
    println!("m0a");
    let txn = db.transaction().await.map_err(|e| GoodError(e.to_string()))?;
    println!("m0b");
...
}

The test output hangs at

$ cargo test -p integrationtests -- --nocapture
   Compiling integrationtests v0.1.0 (/.../integrationtests)
    Finished test [unoptimized + debuginfo] target(s) in 1.95s
     Running tests/pg.rs (target/debug/deps/pg-8afa326beb695e30)

running 1 test
1
m0a
test test_base has been running for over 60 seconds

The docker container is

CONTAINER ID   IMAGE                COMMAND                  CREATED         STATUS         PORTS                                         NAMES
1a1d0caea410   postgres:11-alpine   "docker-entrypoint.s…"   3 minutes ago   Up 3 minutes   0.0.0.0:49160->5432/tcp, :::49160->5432/tcp   jovial_payne

I see this netstat -anpo output:

tcp        0      0 127.0.0.1:47666         127.0.0.1:49160         ESTABLISHED 3822113/pg-8afa326b  keepalive (7020.69/0/0)

This lsof -n output for the test:

$ sudo lsof -n -p 3822113
COMMAND       PID   USER   FD      TYPE   DEVICE SIZE/OFF     NODE NAME
pg-8afa32 3822113 andrew  cwd       DIR    254,3     4096  7497248 /.../integrationtests
pg-8afa32 3822113 andrew  rtd       DIR    254,3     4096        2 /
pg-8afa32 3822113 andrew  txt       REG    254,3 44424392  7385948 /.../target/debug/deps/pg-8afa326beb695e30
pg-8afa32 3822113 andrew  mem       REG    254,3  1953472 15993745 /usr/lib/libc.so.6
pg-8afa32 3822113 andrew  mem       REG    254,3   944600 15993766 /usr/lib/libm.so.6
pg-8afa32 3822113 andrew  mem       REG    254,3   571848 16019409 /usr/lib/libgcc_s.so.1
pg-8afa32 3822113 andrew  mem       REG    254,3   216184 15993732 /usr/lib/ld-linux-x86-64.so.2
pg-8afa32 3822113 andrew    0u      CHR   136,19      0t0       22 /dev/pts/19
pg-8afa32 3822113 andrew    1u      CHR   136,19      0t0       22 /dev/pts/19
pg-8afa32 3822113 andrew    2u      CHR   136,19      0t0       22 /dev/pts/19
pg-8afa32 3822113 andrew    3u  a_inode     0,14        0     3121 [eventpoll:4,6]
pg-8afa32 3822113 andrew    4u  a_inode     0,14        0     3121 [eventfd:97]
pg-8afa32 3822113 andrew    5u  a_inode     0,14        0     3121 [eventpoll:4,6]
pg-8afa32 3822113 andrew    6u     IPv4 67267962      0t0      TCP 127.0.0.1:47666->127.0.0.1:49160 (ESTABLISHED)

psql connections query:

postgres=# SELECT client_addr, query, backend_type FROM pg_stat_activity;
 client_addr |                             query                              |         backend_type         
-------------+----------------------------------------------------------------+------------------------------
             |                                                                | autovacuum launcher
             |                                                                | logical replication launcher
 172.17.0.1  |                                                                | client backend
             | SELECT client_addr, query, backend_type FROM pg_stat_activity; | client backend
             |                                                                | background writer
             |                                                                | checkpointer
             |                                                                | walwriter
(7 rows)

I assume the 2nd client backend is my psql session in-container, so my guess is 172.17.0.1 is the test.

andrewbaxter commented 1 year ago

It was because db_conn wasn't awaited.