Closed codeliner closed 4 years ago
Some more tests + documentation is missing, but I'd like to collect some feedback on the PR. We will run some experiments next week, to check if gap detection solves the issue of skipped events.
I'm not sure about those predefined timing windows. Some transactions might be long-running and the sleep times are fairly short. From what I saw in the past, those gabs are pretty rare, so even if the projection falls behind for 5 secs, that's still acceptable in most situations, so maybe we should add some longer retry times as well. But without data that's only speculation.
"normal gaps" can occur very easily. They are caused by concurrency exceptions. That's the reason why I'd like to avoid a long sleep time by default. 50ms are already a very long time for 99% of write operations in a typical event sourced implementation.
This is pretty awesome stuff and I'll put this to testing on our side this week too!
Cool, looking forward to your results @fritz-gerneth
We performed first tests yesterday but still had gaps with the default sleep times. So I guess @prolic is right that we need longer defaults. But we're testing Postgres event-store without GET_LOCK
.
Hopefully we can perform some more tests tomorrow.
GET_LOCK
+ gap detection should work with short sleep times, right? Because in that case faulty gaps only occur when one aggregate records multiple events with different payload size in one transaction.
GET_LOCK + gap detection should work with short sleep times, right? Because in that case faulty gaps only occur when one aggregate records multiple events with different payload size in one transaction.
I think so.
We ran error-prone operations a couple of thousand times today without any issue (default gap detection settings). Error-prone in the magnitude of "once every 1k - 2k saves" though..
Great news @fritz-gerneth :tada:
We can also confirm that for our case a longer sleep time fixed the issue (No GET_LOCK, only gap detection). We're trying to balance sleep time vs. gaps now.
So I'll definitely continue with the PR and finish it. Sleep times need to be documented anyway, but I think I'd go with @prolic suggestion and increase the default sleep time. If it is too slow for someone, they can simply override the setting and define their own.
I think you can still use the sleep times you have right now, just add more retries with larger intervals. I would be totally fine with a projection hanging 5 secs every now and then, when it means it processes correctly. Way better than handling stuff quick and still having gabs.
@codeliner I tried the solution on Postgres with max retry of 100ms as well as windows time of 120s and without GET_LOCK
strategy.
So far It works good without missing events within 10,000 events in average of 15-20 events/second.
Please Note: for example with our current implementation to override the PostgresSingleStreamStrategy
or MySqlSingleStreamStrategy
, this might break any service that does the same because of these changes https://github.com/prooph/pdo-event-store/commit/2b2f943b717487a2a045a168186246102c787b86#diff-38bb66b2f052fb5ca610f76079f40f3f, would be nice if could take that into consideration during the release :)
Thanks alot for the fix!
thx @ahmed-alaa I started a discussion about BC break for persistence strategies. This will not be included in a 1.12 release: https://github.com/prooph/pdo-event-store/pull/217#issuecomment-592454046
@prolic I guess you mean this PR in your question, right?
I try to finish it on Sunday.
Perfect, thanks
On Fri, Feb 28, 2020, 18:34 Alexander Miertsch notifications@github.com wrote:
@prolic https://github.com/prolic I guess you mean this PR in your question https://github.com/prooph/pdo-event-store/pull/217#issuecomment-592737725, right? Open Todos
- more tests
- Documentation
I try to finish it on Sunday.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prooph/pdo-event-store/pull/221?email_source=notifications&email_token=AADAJPDIM5FTACPVOXTIN6LRFF7PLA5CNFSM4KZVGKQKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOENKHY2I#issuecomment-592739433, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADAJPB6NY26EDQ2ZBTYWATRFF7PLANCNFSM4KZVGKQA .
@prolic ready to merge
Very nice work, I personally used a Fibonacci sequence (0, 5, 5, 10, 15, 25, ... until 2s) as GapDetection options. Thanks
Open Todos
* [x] more tests * [x] Documentation
I guess the OPTION_GAP_DETECTION
option should be added here:
Fixes https://github.com/prooph/pdo-event-store/issues/189
The PR adds a "gap detection" mechanism to projections. This means, that if the projection detects a gap in the stream position while processing events, it pauses processing for a configurable time, before rereading events from the last known position. Gap detection is disabled by default.
Enabling gap detection for a projection:
Enabling gap detection for a read model projection with custom configuration:
Detection window
If you set a detection window, gap detection is only performed on events not older than
NOW - window
If the detection window is set toPT60S
, only events created within the last 60 seconds are checked. Be careful when settingEvent::createdAt
manually (for example during a migration).