joapolarbear / dl_notes

Straggler Detection in Parallel Computing Systems through Dynamic Threshold Calculation #2

Open joapolarbear opened 4 years ago

IEEE2016 PDF

Target

  1. Create fewer replicas during speculative execution (which creates task replicas at runtime and is a typical method deployed in large-scale distributed systems to tolerate stragglers)
  2. Improve resource utilization
  3. Reduce response time

Approach

The paper proposes an algorithm that dynamically calculates a threshold value for identifying task stragglers. The calculation considers key parameters including job QoS timing constraints, task execution characteristics, and optimal system resource utilization.
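
To make the idea of a dynamic threshold concrete, here is a minimal Python sketch. It is not the paper's algorithm: the formula, the function names (`dynamic_threshold`, `find_stragglers`), and the parameters (`deadline_slack`, `utilization`, `base`, `min_factor`, `max_factor`) are all assumptions chosen for illustration. The sketch only shows how the three kinds of input listed above could feed into one threshold: typical task duration, remaining QoS time budget, and current resource utilization.

```python
from statistics import median


def dynamic_threshold(completed_durations, deadline_slack, utilization,
                      base=1.5, min_factor=1.1, max_factor=3.0):
    """Return the elapsed time (seconds) above which a task counts as a straggler.

    Hypothetical formula, NOT taken from the paper:
      - completed_durations: durations of finished tasks (task execution characteristics)
      - deadline_slack: fraction of the job's time budget still left, in [0, 1]
        (stand-in for the QoS timing constraint)
      - utilization: current cluster utilization, in [0, 1]
    """
    if not completed_durations:
        # No history yet: never flag anything.
        return float("inf")

    typical = median(completed_durations)

    # Less slack -> lower factor -> stragglers are flagged earlier.
    # Higher utilization -> higher factor -> fewer replicas are launched.
    factor = base * (0.5 + deadline_slack) * (1.0 + utilization)
    factor = max(min_factor, min(max_factor, factor))
    return typical * factor


def find_stragglers(running, threshold):
    """Return ids of running tasks whose elapsed time exceeds the threshold."""
    return [task_id for task_id, elapsed in running.items() if elapsed > threshold]


if __name__ == "__main__":
    completed = [10, 12, 11, 9, 10]           # seconds, made-up sample data
    running = {"t1": 8, "t2": 16, "t3": 31}   # task id -> elapsed seconds

    idle = dynamic_threshold(completed, deadline_slack=0.5, utilization=0.0)
    busy = dynamic_threshold(completed, deadline_slack=0.5, utilization=1.0)
    print(idle, find_stragglers(running, idle))
    print(busy, find_stragglers(running, busy))
```

In this sketch a busy cluster raises the threshold, so fewer tasks are replicated, which matches the stated targets of creating fewer replicas and improving resource utilization. A static threshold would flag the same tasks regardless of load or deadline pressure.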