cashapp / backfila

Service that manages backfill state, calling into other services to do batched work
https://cashapp.github.io/backfila/
Apache License 2.0

support pipelining data from one service to another #11

Open shellderp opened 4 years ago

shellderp commented 4 years ago
shellderp commented 3 years ago

Some notes

Backfila cross service

UI

Changes to the Create form: after selecting a backfill, if it has an output type, add a selector for the target service. After a service is selected, request the backfills for that service and only allow selecting a backfill with a matching input type.
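The matching rule for the target selector could be sketched roughly as below. `RegisteredBackfill`, its `inputType`/`outputType` fields, and `compatibleTargets` are hypothetical names used only for illustration; they are not taken from the Backfila codebase.

```kotlin
// Hypothetical view of a registered backfill; the field names are assumptions.
data class RegisteredBackfill(
  val id: Long,
  val name: String,
  val inputType: String? = null,  // type the backfill can receive, if any
  val outputType: String? = null  // type the backfill produces, if any
)

/** Returns the target-service backfills whose input type matches the source's output type. */
fun compatibleTargets(
  source: RegisteredBackfill,
  candidates: List<RegisteredBackfill>
): List<RegisteredBackfill> {
  // A source without an output type cannot be pipelined, so nothing is selectable.
  val outputType = source.outputType ?: return emptyList()
  return candidates.filter { it.inputType == outputType }
}
```

With this shape the UI would only show the target selector when `source.outputType` is non-null, and would populate it from `compatibleTargets`.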

The backfill gets created as normal but also sets target_registered_backfill_id (this column already exists). Since the target just receives data and otherwise has no state, we only store a reference to its class (i.e. just the registered_backfill_id) and no extra state.

Fill in the proto field pipelined_data (already exists) from the source's runBatch response and put it into the runBatch request sent to the target.
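A minimal sketch of that hand-off, assuming simplified stand-ins for the real protos and clients. Only the `pipelined_data` field comes from the notes; `BackfillClient`, `batchRange`, and `runPipelinedBatch` are hypothetical.

```kotlin
// Simplified stand-ins for the runBatch protos; only pipelined_data is named in the notes.
class RunBatchRequest(val batchRange: String, val pipelinedData: ByteArray? = null)
class RunBatchResponse(val pipelinedData: ByteArray? = null)

// Hypothetical abstraction over "call runBatch on a registered backfill in some service".
fun interface BackfillClient {
  fun runBatch(request: RunBatchRequest): RunBatchResponse
}

/** Runs one batch on the source, then forwards any pipelined data to the target. */
fun runPipelinedBatch(source: BackfillClient, target: BackfillClient?, batchRange: String) {
  val response = source.runBatch(RunBatchRequest(batchRange))
  val data = response.pipelinedData
  // Only call the target when one is configured and the source actually produced data.
  if (target != null && data != null) {
    target.runBatch(RunBatchRequest(batchRange, pipelinedData = data))
  }
}
```

The target is stateless in this sketch, matching the note that it only receives data: it gets the same batch range plus the bytes and nothing else.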

Receiver

For the target, we can use an entirely new interface that's bound with another Operator.

e.g.:

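    abstract class ReceiverBackfill {
      abstract fun runBatch(data: ByteArray)  // raw pipelined bytes; ByteArray is an assumption
      abstract fun supportedTypes(): Set<Class<*>>
    }

    abstract class ProtoReceiverBackfill<M : Message>(
      private val messageClass: Class<M>  // M is erased at runtime, so M::class is unavailable
    ) : ReceiverBackfill() {
      abstract fun runBatch(messages: List<M>)
      fun typeName(): String = messageClass.name
    }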
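One way the typed receiver could bridge the two `runBatch` overloads is sketched below. To keep it self-contained it avoids the protobuf dependency: `TypedReceiverBackfill`, the `decode` function, and `LineReceiver` are hypothetical stand-ins, and a real proto version would plug a proto parser in where `decode` is.

```kotlin
abstract class ReceiverBackfill {
  abstract fun runBatch(data: ByteArray)
  abstract fun supportedTypes(): Set<Class<*>>
}

// Typed receiver: `decode` stands in for proto parsing and is an assumption.
abstract class TypedReceiverBackfill<M : Any>(
  private val messageClass: Class<M>,
  private val decode: (ByteArray) -> List<M>
) : ReceiverBackfill() {
  /** Implemented by the service author; receives already-decoded messages. */
  abstract fun runBatch(messages: List<M>)

  // The byte-level entry point decodes and delegates to the typed overload.
  final override fun runBatch(data: ByteArray) {
    runBatch(decode(data))
  }

  final override fun supportedTypes(): Set<Class<*>> = setOf(messageClass)

  fun typeName(): String = messageClass.name
}

// Example receiver that treats the payload as newline-separated strings.
class LineReceiver : TypedReceiverBackfill<String>(
  String::class.java,
  { bytes -> bytes.decodeToString().lines() }
) {
  val seen = mutableListOf<String>()

  override fun runBatch(messages: List<String>) {
    seen += messages
  }
}
```

Passing the class in through the constructor is what makes the type available at registration time, since the type parameter itself is erased.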

Publisher

Biggest question: how do we provide a Backfill interface that can return pipeline data, and one that can receive it?

For the sender, we still need all the batching mechanisms of e.g. Hibernate, so we should extend the HibernateBackfill concept. We can either:

  1. Add a method to the context, e.g. context.send_pipeline(bytes). But then how is the type indicated at registration time?

  2. Add a new method that can return data, plus a method that returns the type. Would that be an optional interface that is checked for reflectively? Gross.

  3. Pull the batching logic out of the operator so you can extend either ProducingHibernateBackfill or just HibernateBackfill.

  4. Have a ProducingHibernateBackfill<T> : HibernateBackfill where the runBatch method is final and implemented as in this pseudocode:

    
    final runBatch(list) {
      bytes = list.map { produce(it) }
      context.send(bytes)
    }

    abstract fun produce(pkey): T


maybe bind `ProducingHibernateBackfill` as another operator??
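Option 4 could look roughly like the sketch below. It uses simplified stand-ins rather than the real Backfila classes: `PipelineContext`, `SimpleBackfill`, `ProducingBackfill`, the `encode` function, and `UserIdExport` are all hypothetical, and the real HibernateBackfill has a different, richer signature.

```kotlin
// Stand-in for the operator-provided context; `send` mirrors the pseudocode above.
interface PipelineContext {
  fun send(data: List<ByteArray>)
}

// Simplified stand-in for HibernateBackfill: only the batch callback is modeled.
abstract class SimpleBackfill<K> {
  abstract fun runBatch(pkeys: List<K>, context: PipelineContext)
}

abstract class ProducingBackfill<K, T : Any>(
  // Declared up front so the output type can be reported at registration time.
  val outputType: Class<T>,
  private val encode: (T) -> ByteArray  // serialization is an assumption
) : SimpleBackfill<K>() {
  /** Implemented by the service author: one output value per primary key. */
  abstract fun produce(pkey: K): T

  // Final so subclasses cannot bypass the pipelining step.
  final override fun runBatch(pkeys: List<K>, context: PipelineContext) {
    context.send(pkeys.map { encode(produce(it)) })
  }
}

// Example producer that emits a string per user id.
class UserIdExport : ProducingBackfill<Long, String>(
  String::class.java,
  { it.toByteArray() }
) {
  override fun produce(pkey: Long): String = "user-$pkey"
}
```

Carrying `outputType` on the base class also answers the registration-time question raised for option 1, at the cost of a second base class to maintain.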