Closed jokokko closed 3 months ago
It would be great to also support this on arbitrary tables when using Marten as a document database, not just as an Event Store.
Indeed. But if it's implemented for Event Store tables (well, mainly mt_events), should not be too big a step to generalize for all tables. The hardest thing likely is to offer a flexible way to define the actual partitioning scheme. Then again, being able to do it via Marten would be a huge advantage, as, afaik, tables cannot be partitioned afterwards (talking about native partitioning in PG 10).
I'm relying on Marten for most persistence in my app, but there are two massive tables that are still raw PGSQL because they need partitioning supports (by tenant, timestamp). It would be nice to handle this in Marten.
Although I'd urge you to wait for PG11, which will have quite a few partition improvements.
@turbo currently you can use Conjoined Tenancy that will at least give you possibility to split it by tenant (also stream and events table). I'll have a look on the PG partitioning in PG 10.
Seems like PG11 has been released https://www.postgresql.org/about/news/1894/
My thoughts:
StartStream()
and that hasn't been popular. Could you do it by event type? That's easier to handle in the routing in a way. @jeremydmiller I think that maybe we could start with Native partitioning by stream type? Imho this could be the easiest way to start investigation around that topic. I could try to work on PoC. (eg. by adding new Tenancy "ByStreamType").
If we have the paritioning by stream type then we could try to extend that with the next levels like by date, stream id etc.
I agree that duplicate field would be helpfull and that it would be worth to checking if the MetaData would help us on partitioning.
Imho it would be also worth checking TimescaleDB as it might potentialy simplify that process a lot (@cocowalla did already some initial investigations).
p.s. nice introduction to partitioning https://severalnines.com/blog/how-take-advantage-new-partitioning-features-postgresql-11
@oskardudycz @mysticmind Getting back into this a little bit today. Some thoughts:
From the other day, @mysticmind, @oskardudycz, and I talked about:
There's some opportunity to thin down the indexing for speed.
This isn't a slam dunk, and it's going to add some extra work to users. It makes perfect sense to partition against:
is_archived
by list. This would be tremendously helpful for performance if folks would be rigorous about moving event streams to the archived statetenant_id
by either list or range if it exists. And I'd allow the users to choose how to do that. For indexes:
What would be the migration story of such a feature? Would there be a way for existing users to leverage such a new feature without too much pain?
@nkosi23, we're still in the planning phase, we'd for sure provide the migration guide, but it may require copying data from old tables to new if you want to enable partitioning. See more in:
To avoid table bloat with the Marten Event Store, events table (mt_events by default) could be partitioned using PostgreSQL table partitioning.
While the table can already be manually partitioned after creation with table inheritance in PG 9.x, PG 10 will introduce native table partitioning with new prerequisites (and limitations): https://www.postgresql.org/docs/10/static/ddl-partitioning.html Given that "It is not possible to turn a regular table into a partitioned table or vice versa.", it would be extremely useful to be able to define a partitioning scheme upfront as Marten creates the necessary tables.