stanford-futuredata/gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
MIT License
125
stars
31
forks
issues
Add GavelIterator to regulate number of steps
#147
santhnm2
closed
4 years ago
0
Fix bugs in AlloX policy
#146
deepakn94
closed
4 years ago
2
Add delay matrix to q (to prioritize jobs that came in earlier)
#145
deepakn94
closed
4 years ago
0
Parse outputs on workers and set local rank argument
#144
santhnm2
closed
4 years ago
0
Refactor isolated throughput computation to separate method
#143
deepakn94
closed
4 years ago
0
Update profiler to use existing scheduler infra and support distributed profiling
#142
santhnm2
closed
4 years ago
0
AlloX policy
#141
deepakn94
closed
4 years ago
3
Isolated policy that does not use cvxpy
#140
deepakn94
closed
4 years ago
2
Use some worker type's throughputs for all worker types in base finish time fairness policy
#139
deepakn94
closed
4 years ago
0
Fix bug in finish-time-fairness policy
#138
deepakn94
closed
4 years ago
0
Support for new finish-time-fairness policy in scheduler
#137
deepakn94
closed
4 years ago
0
Normalize throughputs in max-min fairness policy by isolated throughput (instead of throughput sum)
#136
deepakn94
closed
4 years ago
4
Finish-time fairness policy
#135
deepakn94
closed
4 years ago
3
Refactor out common "base" constraints into separate methods in `Policy` and `PolicyWithPacking`
#134
deepakn94
closed
4 years ago
0
Support running on a physical cluster
#133
santhnm2
closed
4 years ago
0
Re-organize policies into separate files
#132
santhnm2
closed
4 years ago
0
Packed cost policy
#131
deepakn94
closed
4 years ago
0
Use a single priority queue shared between all worker types
#130
santhnm2
closed
4 years ago
0
First pass at policies with costs
#129
deepakn94
closed
4 years ago
0
Convert job-job_type allocation back to job-job allocation
#128
santhnm2
closed
4 years ago
0
New allocation format
#127
deepakn94
closed
4 years ago
0
[WIP] Vectorize the MaxMinFairnessPolicyWithPacking policy
#126
santhnm2
closed
4 years ago
0
Add command line option to select solver
#125
santhnm2
closed
4 years ago
0
Make sorting of Job ID pairs deterministic
#124
santhnm2
closed
4 years ago
0
Computing allocation using application throughputs
#123
santhnm2
closed
4 years ago
0
WIP improving solver scalability
#122
santhnm2
closed
4 years ago
0
Use ECOS solver
#121
santhnm2
closed
5 years ago
0
Remove references to gavel directory
#120
santhnm2
closed
5 years ago
0
Index relevant job combinations when computing allocation for packing policies
#119
santhnm2
closed
5 years ago
0
Beginning to add functionality for pruning and warm start
#118
santhnm2
closed
4 years ago
0
Re-organize scripts directory
#117
santhnm2
closed
5 years ago
0
Added script to parse throughput estimation sweep log
#116
santhnm2
closed
5 years ago
0
Update Philly simulation traces
#115
deepakn94
closed
5 years ago
0
Update Philly and throughput_estimation graphs
#114
deepakn94
closed
5 years ago
0
Update figures with latest runs
#113
deepakn94
closed
5 years ago
0
Background figures for Philly
#112
deepakn94
closed
5 years ago
0
Physical cluster with throughput estimation
#111
santhnm2
closed
4 years ago
0
Aesthetic improvements to graphs, and logic to store data used to produce graphs in pickle file
#110
deepakn94
closed
5 years ago
0
Add checkpointing functionality to workloads
#109
santhnm2
closed
5 years ago
0
Notebook with time_per_iteration sweep results
#108
deepakn94
closed
5 years ago
0
Script to compute scaling of policies
#107
deepakn94
closed
5 years ago
0
Fairness with priorities notebook
#106
deepakn94
closed
5 years ago
0
Changes for physical cluster experiments
#105
deepakn94
closed
4 years ago
1
[WIP] Throughput estimation notebook and figures
#104
santhnm2
closed
5 years ago
0
Update throughput estimation methodology to profile against reference models
#103
santhnm2
closed
5 years ago
2
Only compute average job completion time when list is not empty
#102
deepakn94
closed
5 years ago
1
Fixes to policies for multi-GPU jobs
#101
deepakn94
closed
5 years ago
0
Fix bug with throughputs being initialized with V100 throughputs
#100
santhnm2
closed
5 years ago
0
Latest evaluation figures including PDFs to add to paper
#99
deepakn94
closed
5 years ago
0
Add DistributedDataParallel support to 4 of 7 workload types
#98
deepakn94
closed
5 years ago
0