mhart / kinesalite

An implementation of Amazon's Kinesis built on LevelDB
MIT License

random errors in kinesalite using NextShardIterator #25

Closed: jriquelme closed this issue 8 years ago

jriquelme commented 8 years ago

Hi, we use kinesalite to run our integration tests locally. We're having problems with the iterator returned in the GetRecords response (the NextShardIterator value). Our code fails with kinesalite, apparently at random, although it runs perfectly against Kinesis.

The following gist contains a Go program (using https://github.com/aws/aws-sdk-go) that reproduces the problem, along with the output of a typical successful run and a typical failed run.

https://gist.github.com/jriquelme/144816586e3f421d00af

BTW, thanks for developing kinesalite, it's very useful to us.

mhart commented 8 years ago

Hmmm, I wonder if I'm being overly aggressive in my checks for valid ShardIterators.

I won't have time to get your code up and running today, but might be able to tomorrow.

If you want to poke around yourself, I'm pretty sure it's one of these invalidShardIterator checks that's failing: https://github.com/mhart/kinesalite/blob/f30030a6d660f3420299220f57d42f4b9be815f3/actions/getRecords.js#L11-L51

You could just sprinkle around some console.logs to see if you can find anything.

If not, I should have time to look at it in the next couple of days.

mhart commented 8 years ago

Yeah, as I thought: stupid me, forcing a check that the shard iterator's creation time has to be before the current time in milliseconds, whereas of course it could be equal.
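For readers following along, here is a minimal Go sketch of the kind of comparison described above. It is not kinesalite's actual code (which is JavaScript and is not shown in this thread); the function names and the timestamp are hypothetical and only illustrate the difference between a strict and an inclusive check at millisecond resolution.

```go
package main

import "fmt"

// iteratorTimeValidStrict is a hypothetical stand-in for the behaviour
// described above: the iterator's creation time must be strictly earlier
// than the current time (both in milliseconds).
func iteratorTimeValidStrict(createdMs, nowMs int64) bool {
	return createdMs < nowMs
}

// iteratorTimeValid is the relaxed variant: an iterator created in the
// current millisecond is also accepted.
func iteratorTimeValid(createdMs, nowMs int64) bool {
	return createdMs <= nowMs
}

func main() {
	now := int64(1450000000000) // arbitrary example timestamp in ms (assumption)

	// Compare an iterator created 1 ms ago, in this same millisecond,
	// and 1 ms in the future.
	for _, created := range []int64{now - 1, now, now + 1} {
		fmt.Printf("created-now=%dms strict=%t inclusive=%t\n",
			created-now,
			iteratorTimeValidStrict(created, now),
			iteratorTimeValid(created, now))
	}
}
```

An iterator that is used within the same millisecond in which it was created fails the strict check but passes the inclusive one. That would explain why the failures looked random: they depend on how quickly the client sends the next GetRecords call.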

Took me a few tries to reproduce, as it's clearly a race condition – I haven't been able to reproduce it after this fix, but if you can, please reopen.

Fixed and published as v1.11.3 – thanks again for the report!

jriquelme commented 8 years ago

Seems to be working correctly now. Thank you, @mhart.