neos / eventstore-doctrineadapter

Event Store Doctrine Adapter (WIP)
MIT License

FEATURE: Deterministic charset and collation #14

Closed bwaidelich closed 9 months ago

bwaidelich commented 9 months ago

Resolves: #12

bwaidelich commented 9 months ago

Added some more optimizations after talking to @kitsunet (Thanks again for rubber ducking!)

I intentionally left out the binary optimization for id columns as this adds quite a lot of special handling for different database platforms. We can address this with a future release (see #16) or create dedicated adapters.
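For context on the trade-off being deferred: a UUID stored as `CHAR(36)` costs 36 bytes per row (plus index entries), while the same value packed into a `BINARY(16)` column costs 16. A minimal sketch of the round trip using Python's `uuid` module; the names and the sample value are illustrative, not taken from the adapter:

```python
import uuid

def uuid_to_bytes(id_str: str) -> bytes:
    """Pack a canonical 36-char UUID string into 16 raw bytes (BINARY(16) form)."""
    return uuid.UUID(id_str).bytes

def bytes_to_uuid(raw: bytes) -> str:
    """Restore the canonical lowercase string form from the 16-byte representation."""
    return str(uuid.UUID(bytes=raw))

id_str = "0b91b65e-45c2-46a1-a8f6-0a21cbcd9b3c"  # hypothetical event id
raw = uuid_to_bytes(id_str)
assert len(raw) == 16            # vs. 36 bytes for the CHAR(36) representation
assert bytes_to_uuid(raw) == id_str
```

The special handling mentioned above comes from the fact that each platform exposes binary columns differently (`BINARY(16)` on MariaDB, `bytea`/`uuid` on PostgreSQL, `BLOB` on SQLite), so every read and write path would need a conversion step like this.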

bwaidelich commented 9 months ago

I did some benchmarks to estimate the effect of this PR, and unfortunately all numbers are the same or worse with the patch applied.

Performance is equally bad (there are some differences, but I didn't test under fully isolated conditions):

To write 1mio events

To read 1mio events from the $all stream

The size for 1mio events (with empty payload) increases with the patch:

Here is the detailed benchmark: https://gist.github.com/bwaidelich/c7106d6fe39ff855ae94568e01c37fbb And the diff (without => with patch): https://gist.github.com/bwaidelich/c7106d6fe39ff855ae94568e01c37fbb/revisions
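For readers who don't want to open the gist, the methodology can be sketched roughly like this. This is a simplified, hypothetical stand-in in Python/SQLite, not the actual PHP benchmark script; the table layout is abbreviated and N is reduced from 1mio:

```python
import json
import os
import sqlite3
import tempfile
import time

N = 10_000  # the real benchmark used 1 million events

# Fresh database file per run, mirroring the drop-and-recreate approach
path = os.path.join(tempfile.mkdtemp(), "benchmark.sqlite")
con = sqlite3.connect(path)
con.execute("""CREATE TABLE benchmark_events (
    sequencenumber INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    stream VARCHAR(255) NOT NULL,
    version INTEGER NOT NULL,
    type VARCHAR(255) NOT NULL,
    payload CLOB NOT NULL,
    id VARCHAR(255) NOT NULL)""")

# Write phase: N events with an empty JSON payload
start = time.perf_counter()
with con:
    con.executemany(
        "INSERT INTO benchmark_events (stream, version, type, payload, id) "
        "VALUES (?, ?, ?, ?, ?)",
        (("some-stream", i, "SomeEvent", json.dumps({}), f"id-{i}")
         for i in range(N)),
    )
write_seconds = time.perf_counter() - start

# Read phase: scan everything in sequence order, like reading the $all stream
start = time.perf_counter()
rows = con.execute(
    "SELECT * FROM benchmark_events ORDER BY sequencenumber").fetchall()
read_seconds = time.perf_counter() - start

size_bytes = os.path.getsize(path)
print(f"write: {write_seconds:.2f}s, read: {read_seconds:.2f}s, size: {size_bytes} bytes")
```

Running the same script once against the `main` schema and once against the patched schema gives the before/after pairs compared in the gist.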

kitsunet commented 9 months ago

Where is the benchmark code again? Something must be off, or we missed something. I don't expect much speed gain here, given that we don't run any actual queries that could benefit from these changes, but the size should not be bigger.

bwaidelich commented 9 months ago

@kitsunet Ah right, I forgot to add the benchmark script itself. I've added it to the gist now.

I don't expect much speed gain here

Yes, I didn't either, but I wanted to verify that too.

the size should not be bigger

Yes, that puzzled me as well. But there is not much that could go wrong, really: I dropped the tables for each run (and deleted the SQLite file) and then used the total_size reported by TablePlus.
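As a GUI-independent cross-check of that measurement, SQLite can report its own size via PRAGMAs. A small helper (a sketch for comparison purposes, not part of the adapter) that vacuums first so freed pages don't skew the number between runs:

```python
import sqlite3

def sqlite_db_size(path: str) -> int:
    """Database size in bytes as SQLite itself reports it (pages * page size).

    Runs VACUUM first so that pages freed by earlier drops/deletes
    don't inflate the result.
    """
    con = sqlite3.connect(path)
    try:
        con.execute("VACUUM")
        (page_count,) = con.execute("PRAGMA page_count").fetchone()
        (page_size,) = con.execute("PRAGMA page_size").fetchone()
        return page_count * page_size
    finally:
        con.close()
```

For MariaDB/PostgreSQL the equivalents would be `information_schema.tables` (`data_length` + `index_length`) and `pg_total_relation_size()`, which is presumably where TablePlus gets its total_size from as well.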

Here are the table structures created without and with this change applied:

Without patch (branch main)

SQLite

CREATE TABLE benchmark_events (
  sequencenumber INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
  stream VARCHAR(255) NOT NULL,
  version BIGINT UNSIGNED NOT NULL,
  type VARCHAR(255) NOT NULL,
  payload CLOB NOT NULL,
  metadata CLOB DEFAULT NULL,
  id VARCHAR(255) NOT NULL,
  correlationid VARCHAR(255) DEFAULT NULL,
  causationid VARCHAR(255) DEFAULT NULL,
  recordedat DATETIME NOT NULL --(DC2Type:datetime_immutable)
)

MariaDB

CREATE TABLE `benchmark_events` (
  `sequencenumber` int(11) NOT NULL AUTO_INCREMENT,
  `stream` varchar(255) NOT NULL,
  `version` bigint(20) unsigned NOT NULL,
  `type` varchar(255) NOT NULL,
  `payload` longtext NOT NULL,
  `metadata` longtext DEFAULT NULL,
  `id` varchar(255) NOT NULL,
  `correlationid` varchar(255) DEFAULT NULL,
  `causationid` varchar(255) DEFAULT NULL,
  `recordedat` datetime NOT NULL COMMENT '(DC2Type:datetime_immutable)',
  PRIMARY KEY (`sequencenumber`),
  UNIQUE KEY `UNIQ_38ABE26EBF396750` (`id`),
  UNIQUE KEY `UNIQ_38ABE26EF0E9BE1CBF1CD3C3` (`stream`,`version`),
  KEY `IDX_38ABE26ED575A589` (`correlationid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

PostgreSQL

CREATE SEQUENCE IF NOT EXISTS benchmark_events_sequencenumber_seq;
CREATE TABLE "public"."benchmark_events" (
    "sequencenumber" int4 NOT NULL DEFAULT nextval('benchmark_events_sequencenumber_seq'::regclass),
    "stream" varchar NOT NULL,
    "version" int8 NOT NULL,
    "type" varchar NOT NULL,
    "payload" text NOT NULL,
    "metadata" text,
    "id" varchar NOT NULL,
    "correlationid" varchar DEFAULT NULL::character varying,
    "causationid" varchar DEFAULT NULL::character varying,
    "recordedat" timestamp NOT NULL,
    PRIMARY KEY ("sequencenumber")
);
COMMENT ON COLUMN "public"."benchmark_events"."recordedat" IS '(DC2Type:datetime_immutable)';

With patch (branch feature/12-deterministic-charset-and-collation)

SQLite

CREATE TABLE benchmark_events (
  sequencenumber INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
  stream VARCHAR(100) NOT NULL,
  version INTEGER UNSIGNED NOT NULL,
  type VARCHAR(100) NOT NULL,
  payload CLOB NOT NULL,
  metadata CLOB DEFAULT NULL, --(DC2Type:json)
  id CHAR(36) NOT NULL,
  causationid VARCHAR(40) DEFAULT NULL,
  correlationid VARCHAR(40) DEFAULT NULL,
  recordedat DATETIME NOT NULL --(DC2Type:datetime_immutable)
)

MariaDB

CREATE TABLE `benchmark_events` (
  `sequencenumber` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `stream` varchar(100) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL,
  `version` bigint(20) unsigned NOT NULL,
  `type` varchar(100) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL,
  `payload` longtext NOT NULL,
  `metadata` longtext DEFAULT NULL COMMENT '(DC2Type:json)',
  `id` char(36) CHARACTER SET ascii COLLATE ascii_general_ci NOT NULL,
  `causationid` varchar(40) CHARACTER SET ascii COLLATE ascii_general_ci DEFAULT NULL,
  `correlationid` varchar(40) CHARACTER SET ascii COLLATE ascii_general_ci DEFAULT NULL,
  `recordedat` datetime NOT NULL COMMENT '(DC2Type:datetime_immutable)',
  PRIMARY KEY (`sequencenumber`),
  UNIQUE KEY `UNIQ_38ABE26EBF396750` (`id`),
  UNIQUE KEY `UNIQ_38ABE26EF0E9BE1CBF1CD3C3` (`stream`,`version`),
  KEY `IDX_38ABE26ED575A589` (`correlationid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

PostgreSQL

CREATE SEQUENCE IF NOT EXISTS benchmark_events_sequencenumber_seq;
CREATE TABLE "public"."benchmark_events" (
    "sequencenumber" int8 NOT NULL DEFAULT nextval('benchmark_events_sequencenumber_seq'::regclass),
    "stream" varchar NOT NULL,
    "version" int8 NOT NULL,
    "type" varchar NOT NULL,
    "payload" text NOT NULL,
    "metadata" json,
    "id" bpchar NOT NULL,
    "causationid" varchar DEFAULT NULL::character varying,
    "correlationid" varchar DEFAULT NULL::character varying,
    "recordedat" timestamp NOT NULL,
    PRIMARY KEY ("sequencenumber")
);
COMMENT ON COLUMN "public"."benchmark_events"."recordedat" IS '(DC2Type:datetime_immutable)';
bwaidelich commented 9 months ago

I'm going ahead and merging this anyway. The numbers are not better, but they are not really much worse either, and the changes still make sense to me. We can always fix/adjust things in future bugfix/feature releases. WDYT, @kitsunet?

kitsunet commented 9 months ago

Yep, fine by me. I can test this myself over the next days; I think the direction is definitely right.