alephcloud / hs-aws-kinesis-client

A producer/consumer client library for Kinesis
http://hackage.haskell.org/package/aws-kinesis-client
Apache License 2.0
5 stars 3 forks source link

Breaking changes: improved shutdown and error approach, queue abstraction #20

Closed jonsterling closed 9 years ago

jonsterling commented 9 years ago

This is intended to address #19, #17, #21, and #22.

There is more to be done before this may be merged.

jonsterling commented 9 years ago

@larskuhtz When you have a moment, would you mind taking a look at this pull request? It may be best to look mostly at the final diff, since things changed a bit along the way and each individual commit is perhaps not quite so informative.

The main changes here are as follows:

  1. The code that consumed the input queue in "chunks" (in order to batch things in PutRecords requests) was pretty bizarre and wrong; I don't think that it would have caused any bugs, but it probably didn't behave quite how I had hoped. I have replaced it with code that seems a lot more straightforward and actually appears to work properly (see takeTBMQueueWithTimeout and chunkSource). Perhaps there are still bugs in it which I have not observed, though.
  2. Previously pretty much everything was plumbed into MonadError; this caused a lot of bureaucracy and made things a lot more complicated & ornate than they needed to be. I've made the thing internally use normal exceptions, but exposed APIs returning a suitable error type.
  3. Shutdown is cleaner now. When the scope of continuation passed to withKinesisProducer exits, then the remaining messages are flushed out with a configurable timeout.
  4. A lot of internal stuff was using MonadBaseControl IO, etc., but I found that no expressivity had been gained over just using IO directly. So a number of things have been simplified in that respect, though the public APIs are all polymorphic over their monad.

Some things to look for in particular:

  1. Places where exceptions need to be caught but perhaps aren't; or things that need to be masked
  2. Policies for respawning things... For instance, what happens if the producer worker dies? Should it be restarted? Should this logic be handled here or in the client code?
larskuhtz commented 9 years ago

I'll take a look. I'll probably need some time to read through all of it.

Previously pretty much everything was plumbed into MonadError; this caused a lot of bureaucracy and made things a lot more complicated & ornate than they needed to be. I've made the thing internally use normal exceptions, but exposed APIs returning a suitable error type.

While working on similar issues in yet-another-logger I just made a few experiences that make me wonder if the apparent complexity of MonadError and the apparent simplicity of normal Exceptions is somewhat spurious. Maybe with MonadError certain edge cases are just explicit and one is forced to deal with them, while with normal exceptions it's easy to overlook certain subtleties in the semantics of throw and catch. The more I dig in the exceptions in a multithreaded setup the more I wonder if I really understand how they work.

jonsterling commented 9 years ago

I am going to merge this now. To summarize:

  1. The Producer is much improved in multiple respects. When its scope exits, it cleanly shuts down (flushing messages with a timeout); the old behavior was that the producer would never exit until its owning thread was killed.
  2. There is a bug in STM's TQueue which causes live-locking under heavy load; the bug was a strict reverse of the internal queue, which should have been lazy. The Producer now ships by default using a queue based on TBMChan, but it may be configured to use TBMQueue (or any other implementation), which user may wish to do once STM is patched, since TQueue has better throughput in general than TChan.
  3. Many things have been cleaned up & modularized. Exception handling has been adjusted such that exceptions are used internally where it makes sense, and APIs which expect you to handle errors are exposed returning m (Either SpecificErrorType a).
  4. The Consumer is not much changed; because it has not had the benefit of much focused work, there are still problems with it, but they date back to before this version. In other words, this version does not make the Consumer any worse.