will / crystal-pg

a postgres driver for crystal
BSD 3-Clause "New" or "Revised" License
463 stars 78 forks source link

UUID's get corrupted when fetching many #148

Closed horrendo closed 5 years ago

horrendo commented 6 years ago

I'm pretty sure this is reproducible.

In psql, create a test table with 10K UUID's:

log=> create table test(id uuid primary key);
CREATE TABLE
log=> insert into test select gen_random_uuid() from (select unnest(array_fill(null::int, array[10000]))) as t;
INSERT 0 10000

Run this simple crystal program:

Steve:pg_csvlog stbaldwin$ cat t.cr
require "db"
require "pg"

class DBUri < URI
  def initialize(**args)
    self.scheme = "postgres"
    self.user = args["user"]? || ENV["PGUSER"]
    self.password = args["password"]? || ENV["PGPASSWORD"]?
    self.host = args["host"]? || ENV["PGHOST"]?
    self.port = ENV["PGPORT"]?.try &.to_i
    self.path = "/" + ENV["PGDATABASE"]
  end
end

begin
  db = DB.open DBUri.new
  ids = Set(String).new
  puts "Populating set of ids"
  db.query("select id from test") do |rset|
    rset.each do
      ids.add(rset.read(String))
    end
  end
  puts "I found #{ids.size} ids"
  n_fail = 0
  ids.each do |id|
    begin
      dummy = db.scalar("select 0 from test where id = $1", id)
    rescue e
      puts "I didn't find id: #{id}"
      n_fail += 1
    end
  end
  puts "I failed #{n_fail} times"
ensure
  if db
    db.close
  end
end

Running it:

Steve:pg_csvlog stbaldwin$ crystal t.cr
Populating set of ids
I found 10000 ids
I didn't find id: 4c6d1fce-2fcf-4a11-a8ee-e3ee1fce0000
I didn't find id: e40d0000-a3c7-d5fb-4045-5f9294f9cae8
I didn't find id: acb51322-f639-40cd-a8a8-c4e813220000
I didn't find id: f9979251-06f1-43f1-84a3-5e5299fa1544
I didn't find id: 71810000-7897-46e4-46ed-992acb9d2504
I didn't find id: ce1d1834-2257-4e41-aad6-5179b3340000
I didn't find id: 33140a00-292d-f447-dc95-550f40060cc1
I didn't find id: 012c249b-9e50-4e19-96e0-c6df158c0000
I didn't find id: 99674813-837d-4ee7-98e7-fa62c9712c92
I didn't find id: 2c7db931-a218-4220-93b5-d062f7614d00
I didn't find id: e06c5285-7e6c-594d-888c-f982e7d46944
I didn't find id: 62083ace-9746-40b5-a699-34993ace0000
I failed 12 times

I suspect it has something to do with buffer boundaries - but this is just a guess. If, for example I change the first select to:

select id from test limit 909

Running it I get:

Steve:pg_csvlog stbaldwin$ crystal t.cr
Populating set of ids
I found 909 ids
I didn't find id: 4c6d1fce-2fcf-4a11-a8ee-e3ee1fce0000
I failed 1 times

If I change the limit to 908 I get this:

Steve:pg_csvlog stbaldwin$ crystal t.cr
Populating set of ids
I found 908 ids
I failed 0 times

Alternatively, if I cast the UUID column to TEXT in the initial select:

select id::text from test

It works fine:

Steve:pg_csvlog stbaldwin$ crystal t.cr
Populating set of ids
I found 10000 ids
I failed 0 times

The actual data doesn't seem to matter. I can truncate and repopulate the table and it still fails the same number of times - albeit with different values. I did notice the 'corruption' of the uuid values does seem somewhat consistent. For example, here is another run after truncating and re-populating the table:

Steve:pg_csvlog stbaldwin$ crystal t.cr
Populating set of ids
I found 10000 ids
I didn't find id: b60a7a39-a0de-44e9-848a-788a7a390000
I didn't find id: 1e0c0000-4c76-5cbd-f346-14b91da9a793
I didn't find id: ef8c787e-a6ae-407d-8c66-c099787e0000
I didn't find id: bfdb0097-9c76-4676-78a0-3659c7177d8a
I didn't find id: df2f0000-7a89-7620-4555-a42246877bf6
I didn't find id: 84f4e3e6-0878-4d61-a945-398960e60000
I didn't find id: 2ee71c00-195b-6247-2f86-878cd53850b5
I didn't find id: c6632723-d2c2-41f2-9dff-6ef5088a0000
I didn't find id: b49ac06c-f31c-4b12-8412-eeef4f2da1e9
I didn't find id: 2d2818b7-b93a-48a3-aafb-7a4def6afd00
I didn't find id: 97f1e6a6-acf1-ee4a-11ae-5dc9809772d9
I didn't find id: bc55afe3-195c-4cf2-abd8-16d8afe30000
I failed 12 times

Note the corrupted id's have '0000' at the end in both cases on entries 1, 3, 6, 8 and 12.

In case it matters, I'm using postgres 10.4 and this version of crystal:

Steve:pg_csvlog stbaldwin$ crystal --version
Crystal 0.25.1 (2018-06-29)

LLVM: 5.0.2
Default target: x86_64-apple-macosx

Cheers,

Steve

aniliyidogan commented 5 years ago
➜  ~ crystal --version
Crystal 0.29.0 (2019-06-06)

LLVM: 6.0.1
Default target: x86_64-apple-macosx

I'm having the same problem. Did you find a solution that you could apply directly? @horrendo

horrendo commented 5 years ago

Sorry, I didn't find a solution. The 'cast-to-text' workaround is as close as I got.

aniliyidogan commented 5 years ago

Thanks @horrendo. Unfortunately, I will solve it that way.

asterite commented 5 years ago

I might take a look at this tomorrow.