JasperFx / marten

.NET Transactional Document DB and Event Store on PostgreSQL
https://martendb.io
MIT License
2.85k stars 453 forks source link

Partitioning support for Event Store #770

Closed jokokko closed 3 months ago

jokokko commented 7 years ago

To avoid table bloat with the Marten Event Store, events table (mt_events by default) could be partitioned using PostgreSQL table partitioning.

While the table can already be manually partitioned after creation with table inheritance in PG 9.x, PG 10 will introduce native table partitioning with new prerequisites (and limitations): https://www.postgresql.org/docs/10/static/ddl-partitioning.html Given that "It is not possible to turn a regular table into a partitioned table or vice versa.", it would be extremely useful to be able to define a partitioning scheme upfront as Marten creates the necessary tables.

cocowalla commented 7 years ago

It would be great to also support this on arbitrary tables when using Marten as a document database, not just as an Event Store.

jokokko commented 7 years ago

Indeed. But if it's implemented for Event Store tables (well, mainly mt_events), should not be too big a step to generalize for all tables. The hardest thing likely is to offer a flexible way to define the actual partitioning scheme. Then again, being able to do it via Marten would be a huge advantage, as, afaik, tables cannot be partitioned afterwards (talking about native partitioning in PG 10).

turbo commented 6 years ago

I'm relying on Marten for most persistence in my app, but there are two massive tables that are still raw PGSQL because they need partitioning supports (by tenant, timestamp). It would be nice to handle this in Marten.

Although I'd urge you to wait for PG11, which will have quite a few partition improvements.

oskardudycz commented 6 years ago

@turbo currently you can use Conjoined Tenancy that will at least give you possibility to split it by tenant (also stream and events table). I'll have a look on the PG partitioning in PG 10.

isen-ng commented 5 years ago

Seems like PG11 has been released https://www.postgresql.org/about/news/1894/

jeremydmiller commented 5 years ago

My thoughts:

Just use what Postgresql 11 can do

If we programmatically "route" events to separate tables

Async Daemon

oskardudycz commented 5 years ago

@jeremydmiller I think that maybe we could start with Native partitioning by stream type? Imho this could be the easiest way to start investigation around that topic. I could try to work on PoC. (eg. by adding new Tenancy "ByStreamType").

If we have the paritioning by stream type then we could try to extend that with the next levels like by date, stream id etc.

I agree that duplicate field would be helpfull and that it would be worth to checking if the MetaData would help us on partitioning.

Imho it would be also worth checking TimescaleDB as it might potentialy simplify that process a lot (@cocowalla did already some initial investigations).

p.s. nice introduction to partitioning https://severalnines.com/blog/how-take-advantage-new-partitioning-features-postgresql-11

jeremydmiller commented 3 years ago

@oskardudycz @mysticmind Getting back into this a little bit today. Some thoughts:

jeremydmiller commented 3 years ago

From the other day, @mysticmind, @oskardudycz, and I talked about:

There's some opportunity to thin down the indexing for speed.

jeremydmiller commented 3 years ago

This isn't a slam dunk, and it's going to add some extra work to users. It makes perfect sense to partition against:

For indexes:

nkosi23 commented 1 year ago

What would be the migration story of such a feature? Would there be a way for existing users to leverage such a new feature without too much pain?

oskardudycz commented 1 year ago

@nkosi23, we're still in the planning phase, we'd for sure provide the migration guide, but it may require copying data from old tables to new if you want to enable partitioning. See more in: