KuangjuX/Paper-reading
My Paper Reading Lists and Notes.
15 stars, 1 fork
issues
Add Paper Record for GLA and BigBird.
#35
KuangjuX
closed
4 days ago
0
A Loop Transformation Theory and an Algorithm to Maximize Parallelism
#34
KuangjuX
opened
9 months ago
0
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
#33
KuangjuX
closed
6 months ago
0
Attention is all you need
#32
KuangjuX
closed
9 months ago
0
AMOS: Enabling Automatic Mapping for Tensor Computations On Spatial Accelerators with Hardware Abstraction
#31
KuangjuX
closed
9 months ago
0
Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion
#30
KuangjuX
opened
10 months ago
0
PockEngine: Sparse and Efficient Fine-tuning in a Pocket
#29
KuangjuX
closed
6 months ago
0
Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
#28
KuangjuX
closed
11 months ago
0
Graphene: An IR for Optimized Tensor Computations on GPUs
#27
KuangjuX
closed
11 months ago
0
AStitch: Enabling a New Multi-dimensional Optimization Space for Memory-Intensive ML Training and Inference on Modern SIMT Architectures
#26
KuangjuX
closed
11 months ago
1
Welder: Scheduling Deep Learning Memory Access via Tile-graph
#25
KuangjuX
closed
12 months ago
1
Roller: Fast and Efficient Tensor Compilation for Deep Learning
#24
KuangjuX
closed
11 months ago
1
The Tensor Algebra Compiler
#23
KuangjuX
opened
1 year ago
0
Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks
#22
KuangjuX
closed
1 year ago
1
Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning
#21
KuangjuX
closed
11 months ago
1
Pushing the Limits of Machine Design: Automated CPU Design with AI
#20
KuangjuX
closed
1 year ago
0
System Virtualization for Neural Processing Units
#19
KuangjuX
closed
1 year ago
0
Fast Segment Anything
#18
KuangjuX
closed
10 months ago
0
Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences
#17
KuangjuX
opened
1 year ago
1
MLIR: Scaling Compiler Infrastructure for Domain Specific Computation
#16
KuangjuX
closed
11 months ago
0
The Deep Learning Compiler: A Comprehensive Survey
#15
KuangjuX
closed
1 year ago
0
Keystone: an open framework for architecting trusted execution environments
#14
KuangjuX
closed
10 months ago
0
Firecracker: Lightweight Virtualization for Serverless Applications
#13
KuangjuX
closed
10 months ago
0
SkyBridge: Fast and Secure Inter-Process Communication for Microkernels
#12
KuangjuX
closed
10 months ago
0
Ray: A Distributed Framework for Emerging AI Applications
#11
KuangjuX
opened
1 year ago
0
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
#10
KuangjuX
opened
1 year ago
0
Unikraft: Fast, Specialized Unikernels the Easy Way
#9
KuangjuX
closed
1 year ago
0
The benefits and costs of writing a POSIX kernel in a high-level language
#8
KuangjuX
closed
10 months ago
0
Every Walk’s a Hit: Making Page Walks Single-Access Cache Hits
#7
KuangjuX
closed
10 months ago
0
The Demikernel Datapath OS Architecture for Microsecond-scale Datacenter Systems
#6
KuangjuX
closed
10 months ago
0
RedLeaf: Isolation and Communication in a Safe Operating System
#5
KuangjuX
closed
10 months ago
0
Fast Distributed Deep Learning over RDMA
#4
KuangjuX
closed
10 months ago
0
XPC: Architectural Support for Secure and Efficient Cross Process Call
#3
KuangjuX
closed
10 months ago
0
Theseus: an Experiment in Operating System Structure and State Management
#2
KuangjuX
closed
10 months ago
0
Nephele: Extending Virtualization Environments for Cloning Unikernel-based VMs
#1
KuangjuX
closed
10 months ago
0