colinsurprenant / redstorm

JRuby on Storm

Update on_receive block to handle yield statements #65

Open stefan-pdx opened 11 years ago

stefan-pdx commented 11 years ago

on_receive method blocks should be able to yield a tuple, which would in turn get emitted. Right now, it looks like the current implementation iterates through an array of tuples and emits them; however, aggregating them inside the on_receive block might yield poor performance.

Example:

(current form)

class DemultiplexerBolt < RedStorm::SimpleBolt
  on_receive do |tuple|
    tuples = []
    tuple.getValueByField(:ids).each do |id|
      tuples << tuple + [id] 
    end
    tuples
  end
end

(proposed form)

class DemultiplexerBolt < RedStorm::SimpleBolt
  on_receive do |tuple|
    tuple.getValueByField(:ids).each do |id|
      yield tuple + [id] 
    end
  end
end

Does anyone see any issues with updating the logic within SimpleBolt#execute to account for this?
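
One thing to keep in mind for the proposed form: a bare yield inside a block written in a class body has no enclosing method to yield to, so Ruby rejects it (at parse time or with a LocalJumpError, depending on the version). A rough sketch of one possible alternative is to hand the block an emitter callable as a second argument. Everything below is a self-contained toy: ToyEmitterBolt, the emit argument, and the hash-shaped tuple are hypothetical and are not RedStorm's API.

```ruby
# Toy stand-in for a bolt base class; this is NOT RedStorm::SimpleBolt.
class ToyEmitterBolt
  class << self
    attr_reader :receive_block

    # Store the block given in the class body for later execution.
    def on_receive(&block)
      @receive_block = block
    end
  end

  attr_reader :emitted

  def initialize
    @emitted = []
  end

  # Run the stored block, passing a callable that emits one tuple at a time
  # instead of expecting the block to return an array.
  def execute(tuple)
    emitter = ->(values) { @emitted << values }
    instance_exec(tuple, emitter, &self.class.receive_block)
  end
end

class ToyDemultiplexer < ToyEmitterBolt
  on_receive do |tuple, emit|
    tuple[:ids].each do |id|
      emit.call(tuple[:values] + [id])
    end
  end
end

bolt = ToyDemultiplexer.new
bolt.execute({ :values => ["a"], :ids => [1, 2] })
p bolt.emitted
```

Each tuple is handed to the emitter as soon as it is built, so no intermediate array is needed inside the block.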

colinsurprenant commented 11 years ago

You can actually avoid the auto-emit functionality by passing the :emit => false option to on_receive and "manually" calling unanchored_emit or anchored_emit to emit your tuples from within the on_receive block.

example

class DemultiplexerBolt < RedStorm::SimpleBolt
  on_receive :emit => false do |tuple|
    tuple.getValueByField(:ids).each do |id|
      unanchored_emit(tuple + [id])
    end
  end
end

Does this work for you?

I do not really see how yielding tuples would significantly improve performance. What are your thoughts on this?
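
To make the two modes concrete, here is a toy model of how an :emit option could switch between auto-emitting the block's return value and leaving emission to the block. It only borrows the names on_receive and unanchored_emit from this thread; ToyOptionBolt, the hash-shaped tuple, and the internals are assumptions for illustration, not RedStorm's actual implementation.

```ruby
# Toy model only; this is not RedStorm's code.
class ToyOptionBolt
  class << self
    attr_reader :receive_block, :receive_options

    # Remember the block and its options; auto-emit is on unless disabled.
    def on_receive(options = {}, &block)
      @receive_options = { :emit => true }.merge(options)
      @receive_block = block
    end
  end

  attr_reader :emitted

  def initialize
    @emitted = []
  end

  # Stand-in for handing values to Storm's collector.
  def unanchored_emit(values)
    @emitted << values
  end

  def execute(tuple)
    result = instance_exec(tuple, &self.class.receive_block)
    # With :emit => false the block is responsible for emitting.
    return unless self.class.receive_options[:emit]
    result.each { |values| unanchored_emit(values) }
  end
end

class AutoEmitBolt < ToyOptionBolt
  on_receive do |tuple|
    tuple[:ids].map { |id| tuple[:values] + [id] }
  end
end

class ManualEmitBolt < ToyOptionBolt
  on_receive :emit => false do |tuple|
    tuple[:ids].each { |id| unanchored_emit(tuple[:values] + [id]) }
  end
end

input = { :values => ["a"], :ids => [1, 2] }
auto = AutoEmitBolt.new
auto.execute(input)
manual = ManualEmitBolt.new
manual.execute(input)
p auto.emitted
p manual.emitted
```

In this toy, both bolts end up emitting the same tuples; the manual form just skips building the intermediate array.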

colinsurprenant commented 11 years ago

Please let me know your thoughts on my last comment. I would like to release a new redstorm version this week.