will / crystal-pg

a postgres driver for crystal
BSD 3-Clause "New" or "Revised" License
462 stars 77 forks source link

Allow decoding directly to UUID struct #209

Closed jgaskins closed 3 years ago

jgaskins commented 4 years ago

Crystal: 0.35.1 Postgres: 12.3

One of my apps uses UUIDs for all identifiers, but this requires me to put a @[DB::Field(converter: UUIDConverter)] annotation on every single identifier and foreign key in my app models:

struct Comment
  include DB::Serializable

  @[DB::Field(converter: UUIDConverter)]
  getter id : UUID
  @[DB::Field(converter: UUIDConverter)]
  getter commentable_id : UUID
  @[DB::Field(converter: UUIDConverter)]
  getter author_id : UUID
end

So I thought it'd be really nice if I didn't have to add that annotation every time since Postgres has a UUID data type. And then I saw this comment from @RX14:

However, returning UUIDs as the new UUID struct would be something to discuss

So I wanted to at least start that discussion because it would make my application code significantly easier to read.

I also thought it might be faster since we wouldn't have to convert from a raw 128-bit representation off the wire to a chunked hex string and then back again in my app's UUIDConverter, so I ran the following benchmark:

Click for benchmark code ```crystal require "benchmark" require "db" require "uuid" require "../src/pg" pg = DB.open("postgres:///") uuid_count = 1_000_000 memory = 1 time = Benchmark.measure do bytes_before_measure = GC.stats.total_bytes pg.query_all("select uuid_generate_v4() from generate_series(1, $1)", uuid_count, as: {String}) memory = (GC.stats.total_bytes - bytes_before_measure).to_i64 end pp time: time.total, memory_per_uuid: memory // uuid_count ``` For this branch, the `{String}` was changed to `{UUID}`.

I only measured CPU time to ignore the effects of latency in talking to the DB. On my machine, it took almost 400ms of CPU time to decode 1M rows each containing 1 UUID at about 185 bytes of heap memory for each (acknowledging that most of this is not related to decoding UUIDs):

➜  crystal-pg git:(master) ✗ crystal run --release benchmarks/uuid.cr
{time: 0.39878800000000003, memory_per_uuid: 185}

Decoding directly to a UUID, we use about half of that CPU time and shave off ~39 bytes of heap memory per UUID.

➜  crystal-pg git:(decode-uuids) ✗ crystal run --release benchmarks/uuid.cr
{time: 0.174999, memory_per_uuid: 146}

Hoping to get people's thoughts on it since this is probably a breaking change for some apps that store UUIDs as the uuid type in Postgres since they currently have to treat them as strings.

straight-shoota commented 4 years ago

Looks great! I suppose the breaking change is mostly that decoding a UUID as String doesn't work anymore. But this could just continue to avoid breaking.

jwoertink commented 3 years ago

@will this would be super helpful! Any chance this could get a review?

will commented 3 years ago

Thanks for bumping this, it slipped completely from my attention. I'm unavailable for the next few days, sorry, but I'll set a reminder to look early next week.

will commented 3 years ago

Thank you for this. I think it makes a lot of sense and is a good improvement. It is a breaking change so I just released 0.22.0 with this.