mojolicious / mojo-pg

Mojolicious :heart: PostgreSQL
https://metacpan.org/release/Mojo-Pg
Artistic License 2.0

Problems when inserting into a distributed table using the Citus extension #21

ath88 closed this issue 8 years ago

ath88 commented 8 years ago

I have constructed a distributed table with the following SQL:

  CREATE TABLE customer (
    id INTEGER NOT NULL,
    PRIMARY KEY(id)
  );
  SELECT master_create_distributed_table('customer', 'id', 'hash');
  SELECT master_create_worker_shards('customer', 4, 2);

I have a subroutine that performs an insert with an explicitly supplied ID (note that I don't use SERIAL, since distributed tables can't autoincrement an ID). The code looks like this:

sub insert {
  my $id = shift;
  say "Trying insert: $id";
  try {
    $pg->db->query('INSERT INTO customer (id) VALUES (?);', $id);
  }
  catch {
    say "Failed insert: $_";
#    $pg = Mojo::Pg->new($postgres);
  };
}
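
For context, the surrounding script is roughly the sketch below; the imports and connection string are assumptions, and insert() is the subroutine above:

use Mojo::Base -strict;   # strict, warnings and say
use Mojo::Pg;
use Try::Tiny;

# Assumed connection string for the Citus coordinator.
my $postgres = 'postgresql://postgres@localhost:5432/postgres';
my $pg       = Mojo::Pg->new($postgres);

# ... insert() as defined above ...

# Fire 50 sequential inserts with distinct ids.
insert($_) for 1 .. 50;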

Running this function 50 times yields some strange results. After the 6th insert, every request dies with the following error: DBD::Pg::st execute failed: ERROR: cannot plan INSERT using row with NULL value in partition column at test.pl line 43.

This is an error message from the Citus extension. Somehow, Citus believes that the value in the partition column (id) is NULL in my insert statement, and therefore refuses to insert the data. When I 'new up' Mojo::Pg, the next 6 requests succeed, and the 7th fails again.

If I don't use variable binding, the error is not encountered.
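
For comparison, the variant without bind values has the same insert with the id interpolated straight into the SQL, along these lines (a sketch only, obviously not injection-safe; shown just to illustrate the difference):

# Same insert without a placeholder: the value is interpolated into the
# statement instead of being bound, and the Citus error does not appear.
$pg->db->query("INSERT INTO customer (id) VALUES ($id)");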

Expected behavior

I expect that my inserts will work, even with variable binding.

Actual behavior

Every insert after the 6th fails.

Reproducing

The easiest way to reproduce is to deploy a small Citus cluster using docker-compose. See the image on Docker Hub: https://hub.docker.com/r/citusdata/citus/

Then the SQL can be loaded into the master, and the subroutine can be called.
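
One way to do that from Perl itself, reusing the $pg handle from the script sketch above (loading the SQL with psql against the coordinator works just as well):

# Load the schema on the coordinator and distribute the table (same SQL as above),
# then call insert() in a loop as before.
$pg->db->query('CREATE TABLE customer (id INTEGER NOT NULL, PRIMARY KEY (id))');
$pg->db->query("SELECT master_create_distributed_table('customer', 'id', 'hash')");
$pg->db->query("SELECT master_create_worker_shards('customer', 4, 2)");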

kraih commented 8 years ago

Afraid I don't see anything Mojo::Pg-specific here; perhaps you should file a bug against DBD::Pg?

ath88 commented 8 years ago

@kraih The problem doesn't occur when using DBD::Pg directly, so I would naturally assume that it is related to Mojo::Pg. The following code does not provoke the error:

sub insert {
  my $id = shift;
  say "Trying insert: $id";
  try {
    $dbh->do('INSERT INTO customer (id) VALUES (?);', undef, $id);
  }
  catch {
    say "catched:" . $_;
  };
}

kraih commented 8 years ago

Just calling do is not exactly the same as the prepare/execute dance Mojo::Pg does.
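
Something closer would be an explicit prepare followed by repeated executes on the same statement handle, along these lines (a sketch, not the exact internals, and the DSN is an assumption):

use DBI;

# Assumed DSN; point it at the Citus coordinator under test.
my $dbh = DBI->connect('dbi:Pg:dbname=postgres;host=localhost', 'postgres', '',
  {AutoCommit => 1, RaiseError => 1});

# Closer to what Mojo::Pg does: prepare once, then execute repeatedly with bind values.
my $sth = $dbh->prepare('INSERT INTO customer (id) VALUES (?)');
$sth->execute($_) for 1 .. 50;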

ath88 commented 8 years ago

Thanks! Using a prepared statement, I encounter the same problem. I'll bring this to DBD::Pg. :)

thn-iron commented 8 years ago

Interesting problem... I'm doing something similar on the Java side and have seen this kind of behavior too. I'm not sure if it's a driver issue.

thn-iron commented 8 years ago

I found out why it behaves this way, at least with Java and the JDBC driver that I'm using... it has something to do with the prepareThreshold parameter (see the link below):

https://jdbc.postgresql.org/documentation/80/connect.html

By increasing the value, one can insert more records sequentially... and by setting it to zero, it appears to "turn off" the limit.

I suggest checking the DBD::Pg driver to see whether it has the same or an equivalent parameter. If not, find another way to control the threshold.
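
On the Perl side, the closest equivalent appears to be DBD::Pg's pg_server_prepare attribute, which disables server-side prepared statements when set to 0. A sketch of wiring that through Mojo::Pg follows; it assumes the attribute is honoured when passed via the connect options, otherwise it can be set on the database handle directly:

use Mojo::Base -strict;
use Mojo::Pg;

my $pg = Mojo::Pg->new('postgresql://postgres@localhost:5432/postgres');

# Roughly the DBD::Pg counterpart of setting prepareThreshold to zero in the
# JDBC driver: ask DBD::Pg not to use server-side prepared statements at all.
$pg->options->{pg_server_prepare} = 0;

# Alternatively, flip it on an individual handle:
# $pg->db->dbh->{pg_server_prepare} = 0;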