JasperFx / marten

.NET Transactional Document DB and Event Store on PostgreSQL
https://martendb.io
MIT License

Inline multistream projection skips/loses events under heavy load #3488

Closed (Astenna closed this issue 1 week ago)

Astenna commented 1 week ago

We're using Marten 7.29. The provided sample app uses 7.30.2.

We have a console app that calls our API to seed the database with initial data. We noticed that when one of our endpoints that starts a new stream is invoked in parallel, the inline multistream projection does not apply some of the events that started the new stream.

The events were properly applied when we rebuilt that projection with the async daemon. Similarly, if the multistream projection is registered as async, events are also applied correctly. The problem only exists if we register the multistream projection as inline AND we run many requests in parallel (importing our initial data sequentially results in correct application of events).

To illustrate the issue, I created a small project that mimics our setup. To run it, set the connection string in appsettings.json, start the API and then the DataSeeder console app. The console app will start seeding data through the API when you type 'start'.

InlineMultiStreamProjectionConcurrencyIssue.zip

jeremydmiller commented 1 week ago

Yeah, I'm sorry, this one is just me very strongly advising you not to do multi-stream projections as Inline if there's even the slightest bit of contention across threads. You're really going to need to run those asynchronously. I don't think there's anything we could do about that.
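For readers following along, the async registration recommended here might look roughly like the sketch below. It is not taken from the attached sample app: it assumes Marten 7.x in an ASP.NET Core project with implicit usings, a connection string named "Marten" in appsettings.json, and hypothetical `OrderPlaced`, `CustomerSummary`, and `CustomerSummaryProjection` types invented purely for illustration.

```csharp
// Sketch only: assumes Marten 7.x and an ASP.NET Core app with implicit usings.
// OrderPlaced, CustomerSummary, and CustomerSummaryProjection are hypothetical.
using Marten;
using Marten.Events.Daemon.Resiliency;
using Marten.Events.Projections;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMarten(opts =>
{
    // "Marten" is an assumed connection string name
    opts.Connection(builder.Configuration.GetConnectionString("Marten")!);

    // Register the multi-stream projection as Async rather than Inline
    opts.Projections.Add<CustomerSummaryProjection>(ProjectionLifecycle.Async);
})
// Host the async daemon in this process so the projection is kept up to date
.AddAsyncDaemon(DaemonMode.HotCold);

var app = builder.Build();
app.Run();

// Hypothetical event carrying the id of the view document it affects
public record OrderPlaced(Guid CustomerId, decimal Amount);

// Hypothetical projected document, fed by events from many streams
public class CustomerSummary
{
    public Guid Id { get; set; }
    public int OrderCount { get; set; }
    public decimal TotalSpent { get; set; }
}

public class CustomerSummaryProjection : MultiStreamProjection<CustomerSummary, Guid>
{
    public CustomerSummaryProjection()
    {
        // Route each OrderPlaced event to the document keyed by CustomerId
        Identity<OrderPlaced>(e => e.CustomerId);
    }

    public void Apply(CustomerSummary view, OrderPlaced e)
    {
        view.OrderCount++;
        view.TotalSpent += e.Amount;
    }
}
```

With this registration the view is updated by the daemon after the events are committed, so reads are eventually consistent instead of being updated in the same transaction as the append.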

"The problem only exists if we register the multistream projection as inline AND we run many requests in parallel" -- the docs should have made this clear, but I'll revisit those. Really, really don't recommend using a multi-stream projection as Inline because of exactly what happened to you. Just not feasible

Astenna commented 1 week ago

We saw the warning in the documentation advising that multistream projections should be registered as async, but we still wanted to avoid eventual consistency, so we tried to use inline. Everything seemed to work fine until we designed our models in a way that led to a larger number of concurrent edits on the same projected entity. From my perspective, it'd be nicer if the docs explicitly mentioned the worst-case scenario that may happen (e.g. not all events being applied). Previously, I thought the biggest side effect of an inline multistream projection would just be a longer request duration.

Thanks for the prompt response!

jeremydmiller commented 1 week ago

Sigh, it's a bit icky, but if eventual consistency is an issue, you can use the relatively recent WaitForNonStaleDataAsync() extensions.

In Discord just now, I started a discussion about flat out disallowing this usage without a flag saying "it's okay, I know the risks and mean to do this anyway".

jeremydmiller commented 1 week ago

@Astenna See this one: https://github.com/JasperFx/marten/issues/3489