Closed by cizmazia 9 years ago
There's no guarantee that the T in Lazy<T> is Serializable. Plus, if you serialize and then deserialize an instance of Lazy<T> which never fetched the backing T, how is it supposed to fetch that value without being bound to the backing graph?
I use the following pattern with Guava Suppliers (for variables used in lambdas which are serialized and executed by Apache Spark in different processes):
final Supplier<MyObj> supplier = (Serializable & Supplier<MyObj>) () -> new MyObj();
final Supplier<MyObj> lazy = Suppliers.memoize(supplier);
It seems that Lazy can be implemented in the same way as MemoizingSupplier. In my use case, this would eliminate the boilerplate code above.
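As a rough illustration of that idea, here is a minimal JDK-only sketch of a memoizing holder that serializes its factory but not its cached value. It is not Dagger's Lazy or Guava's MemoizingSupplier: SerializableLazy, SerializableSupplier, and LazyDemo are hypothetical names, and MyObj is a stand-in for the class in the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

/** Hypothetical helper: a Supplier whose lambdas are serializable. */
interface SerializableSupplier<T> extends Supplier<T>, Serializable {}

/** Hypothetical stand-in for a serializable Lazy; not part of Dagger or Guava. */
final class SerializableLazy<T> implements Supplier<T>, Serializable {
    private static final long serialVersionUID = 1L;

    // Only the factory travels with the serialized form.
    private final SerializableSupplier<T> delegate;
    // The cached value is transient, so T itself need not be Serializable.
    private transient volatile T value;
    private transient volatile boolean initialized;

    SerializableLazy(SerializableSupplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        if (!initialized) {
            synchronized (this) {
                if (!initialized) {
                    value = delegate.get();
                    initialized = true;
                }
            }
        }
        return value;
    }
}

final class LazyDemo {
    /** Deliberately not Serializable, like MyObj in the thread. */
    static final class MyObj {
        String convert(String x) {
            return x.toUpperCase();
        }
    }

    /** Serializes and deserializes obj in memory, simulating a hop to another process. */
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableLazy<MyObj> lazy = new SerializableLazy<>(() -> new MyObj());
        MyObj local = lazy.get();                 // computed once on this side
        SerializableLazy<MyObj> copy = roundTrip(lazy);
        MyObj remote = copy.get();                // recomputed after deserialization
        System.out.println(remote.convert("x"));  // prints "X"
        System.out.println(local == remote);      // prints "false"
    }
}
```

The sketch only works because the factory here is a self-contained lambda. A Lazy backed by a graph binding would have to serialize that binding and everything it references, which is the open question in this thread.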
Intended use:
class SparkComputation {
  @Inject Lazy<MyObj> lazy;

  public void run() {
    // The lambda is executed in different processes
    stream.map(x -> lazy.get().convert(x));
  }
}
MyObj is not required to be Serializable. Binding<T> delegate in LazyBinding would need to be Serializable. Would it be possible to achieve that if all the upstream graph dependencies were injected as Lazy?
We can't support this in a reasonable way. It breaks many assumptions and I don't think your cute sample is worth the hidden complexities it carries.
One simple assumption: if any dependency of the lazy is a stateful singleton, do we get two instances upon deserialization?
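The duplication concern can be reproduced with plain Java serialization, independent of Dagger. In this small sketch, Registry and SingletonCopyDemo are hypothetical names, and the in-memory round trip within one JVM is only an assumption standing in for a hop between processes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SingletonCopyDemo {
    /** Hypothetical stateful singleton, standing in for a scoped graph dependency. */
    static final class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Registry INSTANCE = new Registry();

        private int hits;

        private Registry() {}

        int hit() {
            return ++hits;
        }
    }

    /** Serializes and deserializes obj in memory. */
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry original = Registry.INSTANCE;
        original.hit();
        // Without a readResolve method, deserialization creates a second object.
        Registry copy = (Registry) roundTrip(original);
        System.out.println(copy == original); // prints "false"
        copy.hit();
        copy.hit();
        // From here on the two instances keep separate state.
        System.out.println(original.hit() != copy.hit()); // prints "true"
    }
}
```

Within a single JVM this yields two live instances of a supposed singleton. Across processes, each worker ends up with its own copy.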
For this kind of use case, that is exactly the expected guarantee: one singleton instance per process.
It is not about my cute sample. As far as I can tell, this is the simplest approach to dependency injection into lambdas that are distributed to multiple Apache Spark workers. Any pointers appreciated.
Is there a reason for not making Lazy Serializable in the same way as Guava Suppliers? This would make it straightforward to use with frameworks which serialize lambdas (with lazy dependencies) to execute them in a different process.