stanford-futuredata gavel issues

stanford-futuredata / gavel

Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020

MIT License

125 stars 31 forks

issues

Add GavelIterator to regulate number of steps

#147 santhnm2 closed 4 years ago
0
Fix bugs in AlloX policy

#146 deepakn94 closed 4 years ago
2
Add delay matrix to q (to prioritize jobs that came in earlier)

#145 deepakn94 closed 4 years ago
0
Parse outputs on workers and set local rank argument

#144 santhnm2 closed 4 years ago
0
Refactor isolated throughput computation to separate method

#143 deepakn94 closed 4 years ago
0
Update profiler to use existing scheduler infra and support distributed profiling

#142 santhnm2 closed 4 years ago
0
AlloX policy

#141 deepakn94 closed 4 years ago
3
Isolated policy that does not use cvxpy

#140 deepakn94 closed 4 years ago
2
Use some worker type's throughputs for all worker types in base finish time fairness policy

#139 deepakn94 closed 4 years ago
0
Fix bug in finish-time-fairness policy

#138 deepakn94 closed 4 years ago
0
Support for new finish-time-fairness policy in scheduler

#137 deepakn94 closed 4 years ago
0
Normalize throughputs in max-min fairness policy by isolated throughput (instead of throughput sum)

#136 deepakn94 closed 4 years ago
4
Finish-time fairness policy

#135 deepakn94 closed 4 years ago
3
Refactor out common "base" constraints into separate methods in `Policy` and `PolicyWithPacking`

#134 deepakn94 closed 4 years ago
0
Support running on a physical cluster

#133 santhnm2 closed 4 years ago
0
Re-organize policies into separate files

#132 santhnm2 closed 4 years ago
0
Packed cost policy

#131 deepakn94 closed 4 years ago
0
Use a single priority queue shared between all worker types

#130 santhnm2 closed 4 years ago
0
First pass at policies with costs

#129 deepakn94 closed 4 years ago
0
Convert job-job_type allocation back to job-job allocation

#128 santhnm2 closed 4 years ago
0
New allocation format

#127 deepakn94 closed 4 years ago
0
[WIP] Vectorize the MaxMinFairnessPolicyWithPacking policy

#126 santhnm2 closed 4 years ago
0
Add command line option to select solver

#125 santhnm2 closed 4 years ago
0
Make sorting of Job ID pairs deterministic

#124 santhnm2 closed 4 years ago
0
Computing allocation using application throughputs

#123 santhnm2 closed 4 years ago
0
WIP improving solver scalability

#122 santhnm2 closed 4 years ago
0
Use ECOS solver

#121 santhnm2 closed 5 years ago
0
Remove references to gavel directory

#120 santhnm2 closed 5 years ago
0
Index relevant job combinations when computing allocation for packing policies

#119 santhnm2 closed 5 years ago
0
Beginning to add functionality for pruning and warm start

#118 santhnm2 closed 4 years ago
0
Re-organize scripts directory

#117 santhnm2 closed 5 years ago
0
Added script to parse throughput estimation sweep log

#116 santhnm2 closed 5 years ago
0
Update Philly simulation traces

#115 deepakn94 closed 5 years ago
0
Update Philly and throughput_estimation graphs

#114 deepakn94 closed 5 years ago
0
Update figures with latest runs

#113 deepakn94 closed 5 years ago
0
Background figures for Philly

#112 deepakn94 closed 5 years ago
0
Physical cluster with throughput estimation

#111 santhnm2 closed 4 years ago
0
Aesthetic improvements to graphs, and logic to store data used to produce graphs in pickle file

#110 deepakn94 closed 5 years ago
0
Add checkpointing functionality to workloads

#109 santhnm2 closed 5 years ago
0
Notebook with time_per_iteration sweep results

#108 deepakn94 closed 5 years ago
0
Script to compute scaling of policies

#107 deepakn94 closed 5 years ago
0
Fairness with priorities notebook

#106 deepakn94 closed 5 years ago
0
Changes for physical cluster experiments

#105 deepakn94 closed 4 years ago
1
[WIP] Throughput estimation notebook and figures

#104 santhnm2 closed 5 years ago
0
Update throughput estimation methodology to profile against reference models

#103 santhnm2 closed 5 years ago
2
Only compute average job completion time when list is not empty

#102 deepakn94 closed 5 years ago
1
Fixes to policies for multi-GPU jobs

#101 deepakn94 closed 5 years ago
0
Fix bug with throughputs being initialized with V100 throughputs

#100 santhnm2 closed 5 years ago
0
Latest evaluation figures including PDFs to add to paper

#99 deepakn94 closed 5 years ago
0
Add DistributedDataParallel support to 4 of 7 workload types

#98 deepakn94 closed 5 years ago
0