johanhaleby / occurrent

Unintrusive Event Sourcing Library for the JVM
https://occurrent.org

(WIP) JPA support for blocking event store #151

Closed nicklundin08 closed 1 week ago

nicklundin08 commented 7 months ago

Disclaimer

:warning: This PR is a WIP (I'm still struggling with some test stuff) :warning: I haven't given much thought to how other modules (besides the blocking event store) might be implemented using JPA (e.g. something like subscriptions is probably more provider specific).

JPA support for blocking event store

I'm curious if you'd be open to supporting JPA implementations of some of the Occurrent modules.

Why

I'll be the first to admit that I'm not a big fan of JPA. Especially for a project like Occurrent, where the data access patterns are very clearly defined, I think it's probably more optimal to simply use a native driver directly.

That being said, supporting JPA might be an easy way to very quickly gain support for many different underlying stores (Postgres, MySQL, DynamoDB, Redis, etc.).

Additionally, it would not preclude the project from also supporting native drivers for those stores in the future.

Design choices

I mostly copied the Mongo implementation, tweaking some things for readability/organization. The Query DSL stuff is implemented using JPA's Specification library (see first link at bottom)

Lombok

I mostly just added this because I wanted to iterate quickly. If you'd rather the project not use Lombok, I'd be happy to remove it!

Mixins

A lot of the heavy lifting that maps the query DSL functionality to JPA's query functionality is done via default interface methods. I called these things Mixins (I don't know if that's the most accurate technical term but :shrug:).

This pattern allows the library to specify a lot of default functionality but gives consumers the ability to override things if required.
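
A minimal sketch of that mixin idea in plain Java follows. All names here (`EventRecord`, `FilterMixin`, `fieldValue`, `equalTo`) are hypothetical and not taken from the PR, and `java.util.function.Predicate` stands in for JPA query types so the sketch runs without dependencies.

```java
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical stand-in for a stored event row; not a type from the PR.
final class EventRecord {
    final String streamId;
    final String type;
    final long version;

    EventRecord(String streamId, String type, long version) {
        this.streamId = streamId;
        this.type = type;
        this.version = version;
    }
}

// "Mixin": an interface whose default methods carry the reusable mapping logic.
interface FilterMixin {
    // Default behaviour: resolve a DSL field name to a value on the record.
    default Object fieldValue(EventRecord e, String field) {
        switch (field) {
            case "streamid": return e.streamId;
            case "type": return e.type;
            case "version": return e.version;
            default: throw new IllegalArgumentException("Unknown field: " + field);
        }
    }

    // Default behaviour: turn "field == expected" into a predicate.
    default Predicate<EventRecord> equalTo(String field, Object expected) {
        return e -> expected.equals(fieldValue(e, field));
    }
}

// A consumer that accepts every default as-is.
class DefaultFilters implements FilterMixin {}

// A consumer that overrides one piece: event types are lower-cased before comparison.
class CaseInsensitiveTypeFilters implements FilterMixin {
    @Override
    public Object fieldValue(EventRecord e, String field) {
        Object value = FilterMixin.super.fieldValue(e, field);
        return "type".equals(field) ? value.toString().toLowerCase(Locale.ROOT) : value;
    }
}
```

With these definitions, `new DefaultFilters().equalTo("type", "namedefined")` rejects an event of type `NameDefined`, while the same call on `new CaseInsensitiveTypeFilters()` accepts it. The override only touches `fieldValue`; `equalTo` is inherited unchanged.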

Batteries not included

With this implementation, some assembly is required from the consumer of the JPA library. To get an idea of what consuming this library might look like, check out the batteries package in the test source set.

This pattern gives the consumer more control over the underlying concrete type that will be used

Resources

WIP

I left this as a WIP because I wanted to get some feedback + see if you're even interested in merging it before I go any further. Let me know!

johanhaleby commented 7 months ago

What a great initiative @nicklundin08! I've wanted to have a SQL implementation for a long time, but I've never had time to think about it or implement it myself. My initial thoughts were to use JDBC or something (just as you hinted at), but maybe JPA would work.

Could you maybe describe the thinking behind your design decisions? For example, if I understand it correctly, you serialize the data part of the CloudEvent to JSON as string. My idea was perhaps to use the native JSON type that I think exists in both MySQL and Postgres (and maybe others)? Not sure if JPA supports this though.

Also, do you have any experience with change streams? That's more or less required if you want a working application :)

nicklundin08 commented 7 months ago

Thanks for the reply!

> Could you maybe describe the thinking behind your design decisions? For example, if I understand it correctly, you serialize the data part of the CloudEvent to JSON as string. My idea was perhaps to use the native JSON type that I think exists in both MySQL and Postgres (and maybe others)? Not sure if JPA supports this though.

Yeah, wrt the data structure that you see in the init tables function, I was simply going fast. I think a better approach would be to either use the JSONB column type that Postgres supports OR do something like convert the cloud event to 3NF. I'm not sure if we could use the JPA specifications to grab nested objects outside of a JSONB column. I'm sure once I implement some more tests it will become clear what the appropriate table structure would be.

Disclaimer: I have no production experience with event sourcing

> Also, do you have any experience with change streams? That's more or less required if you want a working application :)

Is this in the context of implementing subscriptions? I took a peek at the Mongo implementation and I think I see what's going on. Correct me if I'm wrong, but under the hood you are using the MongoDB change stream functionality to trigger various functions that "subscribe" to certain events (or types of events).

If so, I can think of a few ways that you could do that in RDBMS land.

Option 1: Use JPA with polling

This would require/assume a few things

Option 2: Use JPA's auditing entity listener

https://www.baeldung.com/jpa-entity-lifecycle-events

:warning: Unsure what kind of delivery guarantees or data-loss risk this solution would have. Needs more analysis.

Option 3: Use a provider-specific equivalent of MongoDB's change streams

Postgres has the LISTEN/NOTIFY concept.

MySQL might have something similar

Option 4: CDC using integration tools

Use a solution like Debezium to move events from your RDBMS into a Kafka stream. Subscribe to the Kafka stream.


Option #1 would be the most generic to implement, as it doesn't require any provider-specific functionality or an additional piece of infrastructure, but I'm not sure if polling would cause other issues or otherwise not be an acceptable solution for the change stream problem.
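
As a rough illustration of Option 1, the sketch below polls an append-only store for rows beyond the last position it has seen. Everything in it is hypothetical: `InMemoryEventLog` stands in for a table read through JPA, `position` assumes a gap-free, monotonically increasing sequence column, and none of the names come from Occurrent. Real databases can commit rows out of order under concurrent transactions, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical row: a global position plus an opaque payload.
final class StoredEvent {
    final long position;
    final String payload;

    StoredEvent(long position, String payload) {
        this.position = position;
        this.payload = payload;
    }
}

// Stand-in for an events table. A real version would issue a query along the lines of
// "WHERE position > :last ORDER BY position LIMIT :batch" (assumed schema).
final class InMemoryEventLog {
    private final List<StoredEvent> rows = new ArrayList<>();

    synchronized void append(String payload) {
        rows.add(new StoredEvent(rows.size() + 1, payload));
    }

    // Returns up to batchSize rows whose position is greater than the given one, in order.
    synchronized List<StoredEvent> readAfter(long position, int batchSize) {
        List<StoredEvent> batch = new ArrayList<>();
        for (StoredEvent row : rows) {
            if (row.position > position && batch.size() < batchSize) {
                batch.add(row);
            }
        }
        return batch;
    }
}

// One polling subscriber that tracks the last position it has handled.
final class PollingSubscription {
    private final InMemoryEventLog log;
    private final Consumer<StoredEvent> handler;
    private long lastPosition; // a real subscriber would persist this to resume after a restart

    PollingSubscription(InMemoryEventLog log, long startAfter, Consumer<StoredEvent> handler) {
        this.log = log;
        this.lastPosition = startAfter;
        this.handler = handler;
    }

    // Runs one polling cycle and returns how many events were handled.
    int pollOnce(int batchSize) {
        List<StoredEvent> batch = log.readAfter(lastPosition, batchSize);
        for (StoredEvent event : batch) {
            handler.accept(event);         // handle first...
            lastPosition = event.position; // ...then advance, so a failing handler sees the event again
        }
        return batch.size();
    }

    long lastPosition() {
        return lastPosition;
    }
}
```

In a real deployment `pollOnce` would be called on a schedule, for example from a `ScheduledExecutorService`. The polling interval then sets a lower bound on delivery latency, which is the main trade-off against a push-based mechanism.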

Do any of those jump out at you as the right place to start?

Does Occurrent have any particular delivery/order guarantees with subscriptions?

nicklundin08 commented 7 months ago

On another note, is there anything you'd like to see code-wise before considering merging? Here are a few things I'd like to do before marking this as "ready":


Let me know if there's anything you'd like me to add/remove!

johanhaleby commented 7 months ago

Thanks for all your efforts. However, I think I would like to see better JSON support, i.e. not storing the data as a string :/ The reason is that I'd like to support query capabilities (https://occurrent.org/documentation#eventstore-queries), also inside the data property in the cloud event (I'm using this in production atm and I think it can be quite nice). If this is not possible with JPA, I think it might be better to implement support using other means (JDBC if that works). I don't have any experience with this myself though, but I guess it should work in Postgres, MySQL, and probably others.
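
To make the concern concrete, here is a small sketch of why a string-encoded `data` field cannot be queried by nested attribute while a structured one can. The class `DataPathQuery` and the dotted-path convention are assumptions for illustration, not Occurrent's API; in a database with a native JSON type the equivalent lookup would be done by the database itself (for example with Postgres's `->>` operator on a JSONB column).

```java
import java.util.Map;

// Hypothetical helper; not part of Occurrent.
final class DataPathQuery {
    // Resolves a dotted path such as "data.name" against nested maps.
    // Returns null as soon as a segment is missing or the current value is not a map.
    @SuppressWarnings("unchecked")
    static Object resolve(Map<String, Object> event, String dottedPath) {
        Object current = event;
        for (String segment : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // e.g. "data" was stored as an opaque JSON string
            }
            current = ((Map<String, Object>) current).get(segment);
        }
        return current;
    }

    // True if the value at the path equals the expected value.
    static boolean matches(Map<String, Object> event, String dottedPath, Object expected) {
        return expected.equals(resolve(event, dottedPath));
    }
}
```

An event whose `data` entry is itself a map containing `name` matches `matches(event, "data.name", ...)`, whereas an event whose `data` entry is the serialized string form never does, because there is nothing to navigate into without parsing it first.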

Another thing I'm thinking of is subscription support (https://occurrent.org/documentation#subscriptions). Without subscriptions, it's hard to do anything useful. I think that we need subscription support from the get-go. I don't want to bring in Debezium unless it's really needed (to keep dependencies down), but maybe it would be a good starting point if it's too difficult to achieve it ourselves. I don't know if JDBC supports change streams or if one would need to write different change streams for different implementations. If so, Debezium might be a more attractive option, given that it supports everything we want to do.

WDYT?

nicklundin08 commented 6 months ago

Hey sorry for the late reply. I was on vacation last week.


Re: JSON

Gotcha. Yeah, I think I have JSON support in a pretty good spot ATM (see above comment).


Re: Subscriptions

Makes sense not going down a route that makes use of Debezium or something like that.

I think the two options that could be implemented then are

Do you prefer either of those options?


I've got 14 failing tests that I need to wrap up before I take a stab at subscriptions.

frederikb commented 6 months ago

You should check out https://github.com/eugene-khyst/postgresql-event-sourcing in which the author has detailed a robust approach for implementing asynchronous subscribers using both polling and LISTEN/NOTIFY on PostgreSQL. The readme goes into nice detail regarding the care that must be taken due to parallel transactions.

The discussion on Hacker News regarding that reference implementation gives further insight and mentions some alternative approaches: https://news.ycombinator.com/item?id=38084098

Perhaps you can ask @eugene-khyst whether or not he is interested in collaboration.