stanford-futuredata/gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
MIT License
125 stars
31 forks
issues
Set throughput mask only when estimating throughputs
#97
deepakn94
closed
5 years ago
0
Philly trace sweep script
#96
deepakn94
closed
5 years ago
0
Estimate throughputs online
#95
santhnm2
closed
5 years ago
0
New max throughput_sum and max_min_fairness w/ priorities policies
#94
deepakn94
closed
5 years ago
1
Remove dead code for updating colocated job steps
#93
santhnm2
closed
5 years ago
0
Checkpoint scheduler state if `checkpoint_file` and `checkpoint_threshold` is set
#92
deepakn94
closed
5 years ago
1
Add logging code to models to facilitate throughput estimation
#91
santhnm2
closed
5 years ago
0
Remove allocation rounding
#90
santhnm2
closed
5 years ago
0
Set co-located throughputs to 0 if jobs have different scale factors …
#89
deepakn94
closed
5 years ago
0
Add Philly traces
#88
santhnm2
closed
5 years ago
0
Remove measurement_window from simulate() logic in scheduler.py
#87
deepakn94
closed
5 years ago
0
Remove non round-based scheduling mechanism
#86
deepakn94
closed
5 years ago
0
Added new notebook for FIFO motivation figure
#85
santhnm2
closed
5 years ago
1
Remove ideal simulation since it's unused
#84
deepakn94
closed
5 years ago
0
Simulate steady state by adding initial jobs for each worker
#83
santhnm2
closed
5 years ago
0
Add figure illustrating varying model affinity for heterogeneous hardware types
#82
santhnm2
closed
5 years ago
0
Factor out cutoff_throughputs into separate JSON file
#81
deepakn94
closed
5 years ago
0
Remove ratios from command line args, and pass in cluster_specs directly
#80
deepakn94
closed
5 years ago
0
Don't schedule jobs with zero priority only for FIFO policies
#79
deepakn94
closed
5 years ago
0
Minor fixes for sweep script arguments specifying cluster size
#78
santhnm2
closed
5 years ago
0
Disallow zero priority jobs from being scheduled
#77
santhnm2
closed
5 years ago
0
Add oracle throughputs for all model configurations
#76
santhnm2
closed
5 years ago
1
Modify _v2 to support scale_factor > 1
#75
deepakn94
closed
5 years ago
0
Remove un-needed code in _v2 scheduler
#74
deepakn94
closed
5 years ago
0
Added notebook to generate colocated throughput heatmap figure
#73
santhnm2
closed
5 years ago
0
Don't use map which pre-shards, leading to load imbalance across threads
#72
deepakn94
closed
5 years ago
2
[WIP] Least attained service policies
#71
deepakn94
closed
5 years ago
0
Update main fairness notebook with latest results
#70
deepakn94
closed
5 years ago
0
Script to sweep total_num_jobs for traces where all jobs are launched at the start of the trace
#69
deepakn94
closed
5 years ago
0
[WIP] Integrate online throughput estimation into the emulator
#68
santhnm2
closed
5 years ago
1
Added cutoff throughputs for each cluster ratio/policy combination
#67
santhnm2
closed
5 years ago
0
Factor out the get_policy function and add an argument for random seed
#66
santhnm2
closed
5 years ago
0
Initialize the base FIFO policy with the correct random seed
#65
santhnm2
closed
5 years ago
1
Compute number of steps each job should run for on the fly
#64
santhnm2
closed
5 years ago
0
Optimizations for MaxMinFairnessPolicyWithPacking
#63
santhnm2
closed
5 years ago
0
Factor out the `get_policy` function
#62
santhnm2
closed
5 years ago
1
Added script for emulating a single configuration
#61
santhnm2
closed
5 years ago
0
Print additional information when running sweep
#60
santhnm2
closed
5 years ago
0
Renamed Isolated and MaxMinFairness policies
#59
santhnm2
closed
5 years ago
4
Fix bugs in timeline plotting, and output job rate computation
#58
deepakn94
closed
5 years ago
0
Rename isolated policy
#57
santhnm2
closed
5 years ago
1
Added P100 throughputs
#56
santhnm2
closed
5 years ago
2
Add command line flag for generating jobs with fixed duration
#55
santhnm2
closed
5 years ago
0
Pipe output to file instead of writing it at the end
#54
santhnm2
closed
5 years ago
0
Update notebook with latest runs
#53
deepakn94
closed
5 years ago
0
Prevent job from receiving infinite priority after reset
#52
santhnm2
closed
5 years ago
0
Added FIFO with packing policy
#51
santhnm2
closed
5 years ago
0
Misc performance optimizations and bug fixes
#50
santhnm2
closed
5 years ago
0
Print warning message when all GPUs are not used
#49
deepakn94
closed
5 years ago
0
More comprehensive logging in scheduler.py
#48
deepakn94
closed
5 years ago
0