jackfirth / rebellion

A collection of core libraries for Racket
https://pkgs.racket-lang.org/package/rebellion
Apache License 2.0
80 stars 15 forks source link

Throttling transducers #349

Closed jackfirth closed 3 months ago

jackfirth commented 4 years ago

Would it be useful to have a way to limit how many emissions a transducer can produce for each item? Either by delaying/buffering them or by dropping them entirely. Example:

> (transduce (list "big" "red")
             (observing-transduction-events
              (throttling (append-mapping in-string)))
             #:into into-list)
(list start-event
      (consume-event "big")
      (emit-event #\b)
      (consume-event "red")
      (emit-event #\i)
      half-close-event
      (half-closed-emit-event #\g)
      (half-closed-emit-event #\r)
      (half-closed-emit-event #\e)
      (half-closed-emit-event #\d)
      finish-event)
jackfirth commented 4 years ago

Open questions:

  1. Why? I had the idea while thinking about #347 and how an infinitely-emitting subtransducer would starve other transducers of the chance to consume items. This doesn't really fix that, but it's a related idea. I don't have any use cases for this, I just wanted to write it down for posterity.

  2. Should emits and consumes only be one-to-one? Maybe restricting it to a user-supplied ratio would be better, like (throttling t 5) allows t to emit up to 5 items before throttling kicks in.

  3. Where should throttled emissions go? Should they be buffered and delayed (as in the example above), or should they just be dropped entirely?

jackfirth commented 4 years ago

I came up with an actual use case for this: using a transducer to perform an in-place transformation of a mutable sequence, such as a resizable vector. Example:

(define words (make-resizable-vector "hello world" "how are you"))
(resizable-vector-transduce! words (append-mapping string-split))
;; words now contains "hello", "world", "how", "are", and "you"

This requires throttling because outputs are written into the vector as they're emitted. If the transducer emits more outputs than the number of inputs its consumed so far, then naively writing them would overwrite some inputs that haven't been consumed yet. But a throttling transducer would ensure that excess outputs are buffered and delayed until enough inputs have been consumed that the outputs can be written back into the vector.

This isn't a significant enough use case on its own to justify throttling though. Writing it down for posterity.