Implement a control mechanism to prevent two macrotasks from loading or computing the same block at the same time.

NetSys / spark-monotasks

Fast, predictable data analytics based on (and API-compatible with) Apache Spark

Apache License 2.0


Implement a control mechanism to prevent two macrotasks from loading or computing the same block at the same time. #14

Open ccanel opened 9 years ago

ccanel commented 9 years ago

Before the new disk scheduler was integrated into the rest of the Spark code, the CacheManager was responsible for making sure that two tasks did not compute the same block at the same time, which would duplicate work. Now that the CacheManager has been removed, we need to reimplement this functionality somewhere else.
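
One possible shape for such a mechanism, as a rough sketch only: keep a map of in-flight blocks, let the first caller for a block ID do the work, and have later callers for the same ID wait for that result instead of repeating it. The class name `BlockComputeGuard`, the method `getOrCompute`, and the use of a blocking wait are all assumptions made for illustration and are not taken from the spark-monotasks code. The sketch is in Java to keep it self-contained, and a real version would probably also check the block store before computing.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper (not part of spark-monotasks): ensures that at most one
// caller computes or loads a given block at a time.
final class BlockComputeGuard<K, V> {
    // Blocks currently being computed, keyed by block ID.
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight =
        new ConcurrentHashMap<>();

    V getOrCompute(K blockId, Supplier<V> compute) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        // Atomically claim the block; a non-null result means someone else owns it.
        CompletableFuture<V> existing = inFlight.putIfAbsent(blockId, mine);
        if (existing != null) {
            // Another caller is already working on this block: wait for its result.
            // If that computation failed, join() throws a CompletionException.
            return existing.join();
        }
        try {
            V value = compute.get();
            mine.complete(value);
            return value;
        } catch (Throwable t) {
            // Wake up any waiters with the failure so they do not block forever.
            mine.completeExceptionally(t);
            throw t;
        } finally {
            // Release the claim. Later callers will compute again unless the
            // block has been stored somewhere they check first.
            inFlight.remove(blockId, mine);
        }
    }
}
```

The guard only deduplicates work that overlaps in time, which is the case described above. Because waiters block a thread in `join()`, this form may not suit a scheduler that expects tasks never to block; returning the future to the caller would be one alternative.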