mbande closed this issue 7 years ago
Can you please explain what you mean by "expiration", and how you process tuples?
Apache Storm (and streamparse) works like this: a spout emits new data (tuples) into the topology, and a series of bolts transforms that data. There is no concept of "expiration".
Maybe you are talking about failing (NACKing) the tuple.
One way is to have the spout read from, for example, RabbitMQ; when there is a problem in the topology - e.g. some bolt can't process the data - the bolt will fail (NACK) the tuple. The spout then also fails the tuple and sends it back to RabbitMQ (either via a configured dead-letter exchange or by posting it into another exchange).
see http://streamparse.readthedocs.io/en/master/api.html#streamparse.Bolt.fail
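A minimal sketch of that pattern in a streamparse bolt, assuming manual acking (`auto_ack`/`auto_fail` disabled) and a hypothetical `transform()` helper:

```python
from streamparse import Bolt


class ProcessingBolt(Bolt):
    # Handle ack/fail ourselves instead of letting streamparse do it
    auto_ack = False
    auto_fail = False

    def process(self, tup):
        try:
            result = transform(tup.values[0])  # transform() is a hypothetical helper
            self.emit([result])
            self.ack(tup)
        except Exception:
            # Explicitly fail (NACK) the tuple; the spout's fail() hook can
            # then requeue it, e.g. via a RabbitMQ dead-letter exchange
            self.fail(tup)
```

With `auto_fail` left at its default, simply raising an exception in `process()` has the same effect.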
I'm looking for a way to expire tuples while they are waiting for a bolt to process them. When the rate of incoming tuples is higher than the bolt's processing rate, there is some sort of queue that holds tuples until a bolt takes them, right? I want to expire waiting tuples after some time.
I don't know of an expiration option based on waiting time, but I think there are three ways to deal with this (ideally combined):
1) There is the "topology.message.timeout.secs" option, but the timeout is measured from the moment the tuple is emitted into the topology: if the tuple isn't fully acked within that time, it is NACKed.
2) Use the "topology.max.spout.pending" option. It limits the number of pending (un-acked) tuples in the topology, but you need to find the optimal value yourself, so that you don't have too many / too few tuples in flight.
3) Use a back-pressure mechanism, which should limit the number of tuples in the topology based on some metrics (read more about this elsewhere - sorry for pointing at an external resource, but the topic is too vast to cover here).
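Options 1) and 2) can be set per topology in streamparse. A sketch (the components of the topology are omitted here; only the config keys matter, and the values 30 and 500 are placeholder numbers you would tune for your workload):

```python
from streamparse import Topology


class MyTopology(Topology):
    # Spout/bolt definitions omitted; this only illustrates the config keys
    config = {
        # 1) NACK any tuple that isn't fully acked within 30s of being emitted
        "topology.message.timeout.secs": 30,
        # 2) cap the number of un-acked tuples in flight per spout task
        "topology.max.spout.pending": 500,
    }
```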
thank you @Darkless012
I need tuples in streams to expire if there is no task to process them for a while. Is there any way to implement this with Storm concepts?