pentium3/sys_reading
system paper reading notes
235 stars, 12 forks
issues
Cilantro: Performance-Aware Resource Allocation for General Objectives via Online Feedback
#268
pentium3
opened
1 year ago
1
How To Get Your Research Adopted - Emery Berger PLDI 2022 keynote
#267
pentium3
opened
1 year ago
0
ABM: Active Buffer Management in Datacenters
#266
pentium3
closed
1 year ago
0
Yugong: Geo-Distributed Data and Job Placement at Scale
#265
pentium3
opened
1 year ago
0
SkyPilot: An Intercloud Broker for Sky Computing
#264
pentium3
opened
1 year ago
0
Gloss: Seamless Live Reconfiguration and Reoptimization of Stream Programs
#263
pentium3
opened
1 year ago
1
Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency
#262
pentium3
closed
1 year ago
4
Erms: Efficient Resource Management for Shared Microservices with SLA Guarantees
#261
pentium3
closed
2 months ago
2
Scaling a Declarative Cluster Manager Architecture with Query Optimization Techniques
#260
pentium3
closed
1 year ago
0
Big Model Tutorial Techniques and Systems to Train and Serve Bigger Models
#259
pentium3
closed
9 months ago
6
Remote Procedure Call as a Managed System Service
#258
pentium3
closed
9 months ago
0
Training 175B Parameter Language Models at 1000 GPU scale with Alpa and Ray
#257
pentium3
opened
1 year ago
1
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
#256
pentium3
closed
8 months ago
2
Quasar: Resource-Efficient and QoS-Aware Cluster Management
#255
pentium3
closed
8 months ago
2
Saba: Rethinking Datacenter Network Allocation from Application’s Perspective
#254
pentium3
closed
8 months ago
0
Protego: Overload Control for Applications with Unpredictable Lock Contention
#253
pentium3
closed
8 months ago
0
LDB: An Efficient Latency Debugging Tool for Datacenter Applications
#252
pentium3
closed
9 months ago
0
High-throughput Generative Inference of Large Language Models with a Single GPU
#251
pentium3
opened
1 year ago
0
Latency-conscious dataflow reconfiguration
#250
pentium3
opened
1 year ago
0
Reshape: Adaptive Result-aware Skew Handling for Exploratory Analysis on Big Data
#249
pentium3
closed
8 months ago
0
Tuning the Tail Latency of Distributed Queries Using Replication
#248
pentium3
closed
8 months ago
0
Caerus: NIMBLE Task Scheduling for Serverless Analytics
#247
pentium3
opened
1 year ago
1
CrystalPerf: Learning to Characterize the Performance of Dataflow Computation through Code Analysis
#246
pentium3
closed
8 months ago
1
Managing and understanding distributed stream processing
#245
pentium3
closed
8 months ago
0
Better Together: Jointly Optimizing ML Collective Scheduling and Execution Planning using SYNDICATE
#244
pentium3
opened
1 year ago
0
SHEPHERD: Serving DNNs in the Wild
#243
pentium3
opened
1 year ago
1
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
#242
pentium3
opened
1 year ago
0
On Modular Learning of Distributed Systems for Predicting End-to-End Latency
#241
pentium3
opened
1 year ago
0
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training
#240
pentium3
opened
1 year ago
0
NetRPC: Enabling In-Network Computation in Remote Procedure Calls
#239
pentium3
closed
1 year ago
0
Unlocking unallocated cloud capacity for long, uninterruptible workloads
#238
pentium3
opened
1 year ago
0
Understanding Host Network Stack Overheads
#237
pentium3
closed
1 year ago
0
Cosine: A Cloud-Cost Optimized Self-Designing Key-Value Storage Engine
#236
pentium3
closed
1 year ago
0
Samza: Stateful Scalable Stream Processing at LinkedIn
#235
pentium3
opened
1 year ago
0
Making Sense of Performance in Data Analytics Frameworks
#234
pentium3
opened
1 year ago
0
The Power of Choice in Data-Aware Cluster Scheduling
#233
pentium3
closed
8 months ago
0
Graphene: Packing and Dependency-aware Scheduling for Data-Parallel Clusters
#232
pentium3
closed
8 months ago
0
Henge: Intent-driven Multi-Tenant Stream Processing
#231
pentium3
closed
8 months ago
1
Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees
#230
pentium3
closed
8 months ago
0
Streaming Analytics with Adaptive Near-data Processing
#229
pentium3
closed
8 months ago
0
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
#228
pentium3
closed
8 months ago
6
ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning
#227
pentium3
opened
1 year ago
1
Meces: Latency-efficient Rescaling via Prioritized State Migration for Stateful Distributed Stream Processing Systems
#226
pentium3
opened
1 year ago
0
Rhino: Efficient management of very large distributed state for stream processing engines
#225
pentium3
opened
1 year ago
2
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
#224
pentium3
opened
1 year ago
0
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
#223
pentium3
opened
1 year ago
0
Learning Scheduling Algorithms for Data Processing Clusters
#222
pentium3
closed
8 months ago
1
Lessons Learned from the Chameleon Testbed
#221
pentium3
closed
1 year ago
0
Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning
#220
pentium3
opened
1 year ago
1
Autothrottle: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices
#219
pentium3
closed
1 year ago
0