yukarinoki/reseach
0 stars, 0 forks
issues
I want to create a universal language
#50
yukarinoki
opened
6 days ago
5
Maybe I'll read welder
#49
yukarinoki
opened
2 weeks ago
3
Review the polygeist codebase
#46
yukarinoki
opened
1 month ago
8
Try running retargeting
#45
yukarinoki
opened
1 month ago
5
Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization
#44
yukarinoki
opened
1 month ago
9
Graphene: An IR for Optimized Tensor Computations on GPUs
#42
yukarinoki
opened
3 months ago
1
Should AI Optimize Your Code? A Comparative Study of Current Large Language Models Versus Classical Optimizing Compilers
#41
yukarinoki
opened
3 months ago
1
About MLIR
#40
yukarinoki
opened
3 months ago
0
SYCL-Bench 2020: Benchmarking SYCL 2020 on AMD, Intel, and NVIDIA GPUs
#38
yukarinoki
opened
4 months ago
0
Smoothing the Migration from CUDA to SYCL: SYCLomatic Utility Features
#37
yukarinoki
opened
4 months ago
0
RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks with Fine-Grain Utilization
#35
yukarinoki
opened
11 months ago
2
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
#34
yukarinoki
opened
1 year ago
1
Efficient Execution of OpenMP on GPUs
#33
yukarinoki
opened
1 year ago
1
Direct GPU Compilation and Execution for Host Applications with OpenMP Parallelism
#31
yukarinoki
opened
1 year ago
0
Just-in-Time Compilation and Link-Time Optimization for OpenMP Target Offloading
#30
yukarinoki
opened
1 year ago
0
GPU First — Execution of Legacy CPU Codes on GPUs
#29
yukarinoki
opened
1 year ago
6
Multi-Objective Task Assignment and Multiagent Planning with Hybrid GPU-CPU Acceleration ?
#28
yukarinoki
closed
1 year ago
2
Accelerating Parallel Tasks by Optimizing GPU Hardware Resource Utilization
#27
yukarinoki
opened
1 year ago
0
Dynamic GPU Scheduling with Multi-resource Awareness and Live Migration Support (2023)
#26
yukarinoki
opened
1 year ago
1
Pagoda: A GPU Runtime System for Narrow Tasks (2019)
#25
yukarinoki
opened
1 year ago
2
Task management for irregular-parallel workloads on the GPU (2010)
#24
yukarinoki
opened
1 year ago
0
Modeling GPU Dynamic Parallelism for Self Similar Density Workloads
#23
yukarinoki
opened
1 year ago
0
A Versatile Software Systolic Execution Model for GPU Memory-Bound Kernels
#21
yukarinoki
closed
1 year ago
0
Graphene: An IR for Optimized Tensor Computations on GPUs
#20
yukarinoki
closed
1 year ago
0
ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code
#19
yukarinoki
closed
1 year ago
2
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs.
#17
yukarinoki
closed
1 year ago
3
AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures
#16
yukarinoki
opened
1 year ago
2
Apollo: Automatic partition-based operator fusion through layer by layer optimization
#15
yukarinoki
opened
1 year ago
3
A Fast Post-Training Pruning Framework for Transformers
#14
yukarinoki
closed
1 year ago
0
RAMMER: enabling holistic deep learning compiler optimizations with rtasks
#13
yukarinoki
opened
1 year ago
0
Parallel Implementations of Candidate Solution Evaluation Algorithm for N-Queens Problem
#12
yukarinoki
opened
1 year ago
0
Powering Multi-Task Federated Learning with Competitive GPU Resource Sharing
#11
yukarinoki
closed
1 year ago
5
Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI
#10
yukarinoki
closed
1 year ago
0
From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels
#8
yukarinoki
opened
1 year ago
3
Microsecond-scale preemption for concurrent GPU-accelerated DNN inferences
#7
yukarinoki
opened
1 year ago
3
Modeling GPU Dynamic Parallelism for self similar density workloads
#6
yukarinoki
opened
1 year ago
1
A Compiler Framework for Optimizing Dynamic Parallelism on GPUs
#5
yukarinoki
opened
1 year ago
7
GPUIterator: bridging the gap between Chapel and GPU platforms
#4
yukarinoki
opened
1 year ago
2
HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines
#3
yukarinoki
opened
1 year ago
2
An Inter-GPU Self-Communication Mechanism for GPU Clusters
#2
yukarinoki
opened
1 year ago
5
Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning
#1
yukarinoki
opened
1 year ago
9