liquidprojections / LiquidProjections

Liquid Projections supports building and maintaining autonomous .NET projection code in an Event Sourcing architecture.
https://www.liquidprojections.net
MIT License
169 stars 25 forks source link

Including old events with NULL checkpoint token #91

Closed galenp closed 7 years ago

galenp commented 7 years ago

Hello,

In our NEventStore we have a fair number of events that were stored before the checkpoint token implementation.

When using the NEventStore polling client if we pass in NULL as the reference checkpoint token then it retrieves these old events. We do this when we do our replays.

I haven't checked what it's using to sort them, timestamp maybe? but they do come out in the correct order tho.

I was not able to get LiquidProjections to do this, could you offer some guidance?

Thanks

dennisdoomen commented 7 years ago

That could be a NES limitation. We just call IPersistStreams.GetFrom. Are you getting an exception of some sort?

galenp commented 7 years ago

Nah i think what it does is uses 0 as the default instead of NULL and skips all the old ones.

The PollingClient has this signature public override IObserveCommits ObserveFrom(string checkpointToken = null); public override IObserveCommits ObserveFromBucket(string bucketId, string checkpointToken = null);

dennisdoomen commented 7 years ago

OK, should be easy to fix.

dennisdoomen commented 7 years ago

@galenp does NES still assign a checkpoint token to those commits? And which version of NES are you using?

galenp commented 7 years ago

NES 5.2 specifically. I'm not aware that it has changed in 6.

Magic happens here I think; https://github.com/NEventStore/NEventStore.Persistence.RavenDB/blob/master/src/NEventStore.Persistence.RavenDB/RavenPersistenceEngine.cs

And a commit of interest; https://github.com/NEventStore/NEventStore.Persistence.RavenDB/commit/b3c248a026633b10c41383a72a9cc9f99459f499

dennisdoomen commented 7 years ago

@galenp Do you have an example of a commit like that?

galenp commented 7 years ago

So the scenario is a NULL checkpoint token. So on 5.2 the NES default polling client works as it uses string for CPT arguments.

The main polling client flow is here; https://github.com/NEventStore/NEventStore/blob/master/src/NEventStore/Client/PollingClient.cs

The method is DoPoll() and that calls IPersistStreams.GetFrom which also takes a string.

Which now throws the problem to each persistence engine implementation.

The link to the Raven DB persistence engine in my previous comment is what I've used and it worked. Looking at that it just does a long.parse which I guess treats a null string as 0.

I guess the change needed is to update liquid projections to work the same way. Have CPT as a string that's passed into NES instead of a numeric.

Sorry if this comment sucks, on phone.

IharBury commented 7 years ago

Null string is treated as -1 actually: https://github.com/NEventStore/NEventStore/blob/master/src/NEventStore/LongCheckpoint.cs#L36

But I still cannot really see how this can work with commits that don't have a checkpoint. Actually the checkpoint number is non-nullable long in NEventStore.Persistence.RavenDB at the moment: https://github.com/NEventStore/NEventStore.Persistence.RavenDB/blob/master/src/NEventStore.Persistence.RavenDB/RavenCommit.cs#L25

galenp commented 7 years ago

Very good point... It does work in what I've experienced... Let me run some tests and figure out how that actually happens.

galenp commented 7 years ago

Ok the below service uses the NES PollingClient to poll the eventstore and any events are saved to disk. It saves the checkpoint as a string in a text file. When the text file is missing it returns NULL.

I ran this below with a NULL checkpoint and got the desired results.

PollingClient Gist

The first RavenCommit == the first event image

The call to RavenDb by PollingClient image

First event picked up image

Events simply saved to disk image

IharBury commented 7 years ago

Aha, it makes sense now. I see NEventStore.Persistence.RavenDB does paged queries with Skip and Take: https://github.com/NEventStore/NEventStore.Persistence.RavenDB/blob/master/src/NEventStore.Persistence.RavenDB/RavenPersistenceEngine.cs#L184

The query is ordered by the checkpoint number. However RavenDB apparently keeps the order for transactions with the same zero checkpoint number.

But even in that case, if transaction processing is stopped when not all the zero checkpoint commits have been processed, it's not possible to continue processing from the next commit.

Also support for this is RavenDB specific. NEventStore.Persistence.Sql has internal limit of one page per database request here: https://github.com/NEventStore/NEventStore.Persistence.SQL/blob/master/src/NEventStore.Persistence.Sql/SqlPersistenceEngine.cs#L305. It will skip all the zero checkpoint transactions other than from the first page.

To support zero checkpoint transactions in possible scenarios we will have to make zero checkpoint a special case everywhere. We should load transactions from zero checkpoint as enumeration of unlimited length until we get a non-zero checkpoint transaction. We should cache all the zero checkpoint transactions separately as a collection. All the projectors should also be aware of zero checkpoint commits. When projecting stops and continues and the last commit was zero checkpoint the entire projecting should restart (after deleting all the projections).

galenp commented 7 years ago

To support zero checkpoint transactions in possible scenarios we will have to make zero checkpoint a special case everywhere. We should load transactions from zero checkpoint as enumeration of unlimited length until we get a non-zero checkpoint transaction. We should cache all the zero checkpoint transactions separately as a collection. All the projectors should also be aware of zero checkpoint commits. When projecting stops and continues and the last commit was zero checkpoint the entire projecting should restart (after deleting all the projections).

This seems to make sense to me. Obviously it's not that pretty for a 'legacy' problem.

I guess the merits of such a change vs the work required in migrating an event store with a number of commits that have 0 checkpoint through a process where they get populated into a new NES with a checkpoint. Considering raven does support this retrieval in an ordered manner it shouldn't be too difficult.

Maybe if the change you are suggesting is too broad for this "edge case" then this recourse might be better?

In short; if its easy do it... if its in anyway problematic perhaps the policy is to fix the event store first?

dennisdoomen commented 7 years ago

In short; if its easy do it... if its in anyway problematic perhaps the policy is to fix the event store first?

So it seems we can't really fix this easily. We need that checkpoint to allow us to cache the internals. I suggest you solve the problem on your side first.