Closed akaigoro closed 9 years ago
Sure, the dining philosophers issue, i.e., a deadlock or a livelock, can occur in any system where multiple workers are accessing shared resources concurrently. For example, a pipeline containing a single DoFn
that may acquire two locks in a random order could run into this issue. The "locks" are assumed to be global to the Dataflow worker VMs.
I mean, is it possible to write correct implementation using DtataflowJavaSDK?
Sure, there are several approaches that can result in "correct implementations". Resource hierarchy solution requires acquiring locks in a partial order and doesn't require any particular support from the system itself.
Other solutions may require some additional work. For example, arbitrator solution could be implemented by having Datataflow workers communicating with a global arbitrator, which can be built as a service outside Dataflow. Solutions which require communication between workers, like Chandy/Misra solution, are harder.
In principle, however, Dataflow programming model is higher layer and very different than a plain message-passing system where users coordinate work between workers. In Dataflow, workers shouldn't be talking to each other. Please check-out our programming model documentation. Perhaps it would be beneficial to focus on the underlying data-processing problem you would like to solve with Dataflow and how to express it using Dataflow principles, instead of an implementation detail of deadlocks during execution.
subj