basho / riak_pipe

Riak Pipelines
Apache License 2.0

Don't trigger archive behavior until handoff ?FOLD_REQ is received #40

Closed by beerriot 12 years ago

beerriot commented 12 years ago

Handoff startup takes two steps: a call to Module:handoff_starting, followed by receipt of a ?FOLD_REQ message. The vnode can receive messages from its workers and other processes between these two steps (thanks, Joe). We don't want to start doing handoff things until the ?FOLD_REQ message arrives, though.

This bug was detected by the dynamic cluster + MapReduce test. It was possible for a vnode to have its handoff_starting function called, then receive a next_input request from one of its workers. The vnode, thinking it was in handoff, would tell the worker to archive. If the worker also finished and sent its archive back to the vnode before the ?FOLD_REQ message arrived, a badrecord error would be raised in riak_pipe_vnode:archive_internal/2. At that point #state.handoff was still the atom starting, as set by the handoff_starting function, instead of the #handoff{} record it becomes once the ?FOLD_REQ message is received.

This fix prevents the whole mess by maintaining normal, non-handoff operation until ?FOLD_REQ is received, so the response to next_input is the worker's next input instead of the archive command.
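The ordering described above can be modeled as a small state machine. The following is only an illustrative sketch in Python, not the riak_pipe code: the `Vnode` and `Handoff` classes, the method names, and the `("input", ...)`, `("archive", None)`, and `("wait", None)` replies are all hypothetical stand-ins for the vnode's `#state.handoff` field, the `#handoff{}` record, and the messages exchanged with workers. It assumes the fix amounts to answering next_input normally unless the handoff field already holds the record.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical stand-in for the #handoff{} record."""
    archives: list = field(default_factory=list)


class Vnode:
    """Toy model of the vnode's handoff field: None, "starting", or Handoff."""

    def __init__(self, inputs):
        self.inputs = list(inputs)
        self.handoff = None

    def handoff_starting(self):
        # Step one of handoff startup: only note that handoff is pending.
        self.handoff = "starting"

    def fold_req(self):
        # Step two: the fold request arrives, so handoff is really under way.
        self.handoff = Handoff()

    def next_input(self):
        # Assumed fix: ask workers to archive only once the record exists.
        if isinstance(self.handoff, Handoff):
            return ("archive", None)
        if self.inputs:
            return ("input", self.inputs.pop(0))
        return ("wait", None)

    def archive(self, data):
        # Mirrors the original failure: an archive that arrives while the
        # field is still "starting" has no record to be stored in.
        if not isinstance(self.handoff, Handoff):
            raise TypeError("badrecord: handoff is %r" % (self.handoff,))
        self.handoff.archives.append(data)


vnode = Vnode(["a", "b"])
vnode.handoff_starting()
reply_before_fold = vnode.next_input()  # normal operation continues
vnode.fold_req()
reply_after_fold = vnode.next_input()   # now the worker is told to archive
vnode.archive("worker-state")
```

In this model a worker that asks for input between the two steps simply gets its next input, so no archive can reach the vnode before the record exists.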

rzezeski commented 12 years ago

This was merged via https://github.com/basho/riak_pipe/pull/41