isoos / postgresql-dart

Dart PostgreSQL driver: supports extended query format, binary protocol and statement reuse.
https://pub.dev/packages/postgres
BSD 3-Clause "New" or "Revised" License
131 stars 37 forks source link

v3 Regression: Multiple commands in the same `execute` #328

Open davidmartos96 opened 7 months ago

davidmartos96 commented 7 months ago

Hello! I was in the process of migrating a project using postgres v2. One thing we are doing in this project to improve the performance of batch uploads to Postgres is to run multiple INSERT statements in the same execute and it has been working great for us.

The problem we are seeing is that in v3 it throws with an error: Severity.error 42601: cannot insert multiple commands into a prepared statement

Are we missing some special configuration to support this?

Sample code:

Version 2.x

import 'package:postgres/postgres.dart';

void main(List<String> args) async {
  final connection = PostgreSQLConnection(...);
  await connection.open();

  await connection.execute('''
  CREATE TEMP TABLE todos (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
  )
''');

  await connection.execute(
    '''
  INSERT INTO todos (id, description) VALUES (@id_1, @description_1);
  INSERT INTO todos (id, description) VALUES (@id_2, @description_2);
  INSERT INTO todos (id, description) VALUES (@id_3, @description_3);
''',
    substitutionValues: {
      'id_1': 1,
      'description_1': 'Buy milk',
      'id_2': 2,
      'description_2': 'Buy eggs',
      'id_3': 3,
      'description_3': 'Buy bread',
    },
  );

  final result = await connection.query('SELECT * FROM todos');
  for (final row in result) {
    print('id: ${row[0]}, description: ${row[1]}');
  }

  await connection.close();
}

Version 3.x

import 'package:postgres/postgres.dart';

void main(List<String> args) async {
  final connection = await Connection.open(
    Endpoint(
      ...
    ),
  );

  await connection.execute('''
  CREATE TEMP TABLE todos (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
  )
''');

  await connection.execute(
    Sql.named('''
  INSERT INTO todos (id, description) VALUES (@id_1, @description_1);
  INSERT INTO todos (id, description) VALUES (@id_2, @description_2);
  INSERT INTO todos (id, description) VALUES (@id_3, @description_3);
'''),
    parameters: {
      'id_1': 1,
      'description_1': 'Buy milk',
      'id_2': 2,
      'description_2': 'Buy eggs',
      'id_3': 3,
      'description_3': 'Buy bread',
    },
  );

  final result = await connection.execute('SELECT * FROM todos');
  for (final row in result) {
    print('id: ${row[0]}, description: ${row[1]}');
  }

  await connection.close();
}
isoos commented 7 months ago

42601: cannot insert multiple commands into a prepared statement

This is coming from the postgresql server, and apparently it is coming from the extended query protocol's prepare phase.

I don't really know how this was implemented in v2, it had a non-trivial complexity around queries that I never ventured into. Maybe it converted it to simple query protocol - I am just guessing here.

In v3 we are doing mostly the minimum over the query protocols, in this case the extended query protocol (which is usually more efficient). If the server rejects the queries, my best guess is that you rewrite them into:

/cc @simolus3 for further comments

simolus3 commented 7 months ago

Maybe it converted it to simple query protocol - I am just guessing here.

Yeah, v2 essentially inlined the variables into the SQL string. If you want to do that yourself, you can set queryMode: QueryMode.simple on execute and use SQL without variables. My opinion is that this package shouldn't do these kinds of tricks, there probably are a lot of edge cases around typing that we'll get slightly wrong, and those issues are very tricky for users to diagnose.

(1) separate queries but reusing their prepared state with https://pub.dev/documentation/postgres/latest/postgres/Session/prepare.html

This is perhaps something we should investigate further. At the moment, running a prepared statement essentially locks the connection until we get a ReadyForQueryMessage (see the _scheduleStatement call in the _PgResultStreamSubscription constructor). This means that running n prepared statements requires a sequence of n roundtrips without opportunities for interleaving or concurrency.

This will complicate the state machine in this package a bit, but I think statement responses are always coming back in-order, so we could keep a queue of outstanding statements and issue all requests at once without waiting for responses. That would reduce the overall latency when batching statements.

davidmartos96 commented 6 months ago

@simolus3 That's interesting, I didn't know it was binding the values as literals. My use case for these fortunately are inserts, which we can refactor to use a single one with multiple values. Thank you!

Feel free to keep it open with a different title or close.