Open wenxcs opened 3 years ago
Release Plan for V0.4
Release Note:
Annotation Legend:
Feature
Python API
Customized operator [ziming]
Function codegen
(Output tensors use the easiest allocation method)
Other
[Not included] Code generator
In this section, CG refers to our new code generator.
Tensor Program level API
Schedule
Stretch
Mechanism
Graph Rewriting Feature @wenxcs | ? weeks
Enhance user experience and support for ONNX format @jlxue | 1 week
Unify kernel provider interface for Antares, KernelDB, and our internal kernels @wenxcs | 1 week
Refactor/Improvement
[ ] Element-wise operator performance @xysmlx
[ ] Constant folding performance @yiyione (TBD)
[ ] Profiler performance
Kernel Performance:
[Low priority] Serialization and deserialization interface
[ ] Docker image support for older CUDA driver versions @wenxcs | 3 days
[ ] Investigation on GNN training @xysmlx
[ ] Merge Gamefusion: reduce fusion @yiyione
[ ] Benchmark page @wenxcs | 3 days
Quant & Sparsity Interface
CPU performance improvement
Android code-gen & tuning
Control-flow support | @xysmlx
New profiler: support different kinds of profiling information, e.g., time and kernel time provided by nvprof | @xysmlx
Profiling cache DB | @xysmlx
BlockFusion | @xysmlx