tlc-pack/tvm-tensorir
Apache License 2.0
[Roadmap] AutoTensorIR #69
Closed (junrushao closed 3 years ago)
junrushao commented 4 years ago
CPU Rules
[x] Multi-level-tiling
[x] Multi-level-tiling-with-fusion
[x] Factorize-reduction
[ ] Simplify-compute-with-const-tensor
[x] Add-cache-write
[x] Always-inline
[x] Parallelize outer, vectorize inner
GPU Rules
[x] Multi-level-tiling
[x] Multi-level-tiling-with-fusion
[x] Cross-thread-reduction
[x] Simplify-compute-with-const-tensor
[ ] Special-compute-location
[x] Add-cache-write
[x] Add-cache-read
[x] Always-inline
Init rules
[x] Init-parallel
[x] Init-vectorize
[x] Init-fill-tile-size
[x] Change-compute-location
[x] Init-thread-bind
[x] Init-unroll
Mutation rules
[x] Mutate tile size
[x] Mutate compute location
[x] Mutate parallel
[x] Mutate max unroll factor
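For readers unfamiliar with the "Mutate tile size" item, here is a minimal sketch of one common way such a mutation can work: move a single prime factor from one tile to another so that the product of the tile sizes (the loop extent) stays unchanged. The function name `mutate_tile_size` and its signature are hypothetical, chosen for illustration only; this is not the code from this repository.

```python
import random


def mutate_tile_size(tiles, rng):
    """Hypothetical helper: move one prime factor between two tiles.

    The product of the returned tile sizes equals the product of the input,
    so the mutated tiling still covers the same loop extent.
    """
    tiles = list(tiles)
    # Only tiles larger than 1 have a factor to give away.
    donors = [i for i, t in enumerate(tiles) if t > 1]
    if not donors or len(tiles) < 2:
        return tiles  # nothing to mutate
    src = rng.choice(donors)
    dst = rng.choice([i for i in range(len(tiles)) if i != src])
    # Smallest prime factor of the donor tile.
    factor = next(p for p in range(2, tiles[src] + 1) if tiles[src] % p == 0)
    tiles[src] //= factor
    tiles[dst] *= factor
    return tiles


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    print(mutate_tile_size([4, 8, 2, 1], rng))
```

Passing an explicit `random.Random` instance, rather than using the global generator, keeps the mutation reproducible under a fixed seed.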
Cost model, features, search policy
[x] Random
[x] XGBModel
[x] Feature extraction
[x] Evolutionary search
[x] Random search
[x] Serialization
Misc
[x] Sampler re-design, reproducibility
[x] Re-design mutation
[x] Tiling size factorization
[x] Evolutionary search API re-design
[x] Merge builder and runner in AutoTune API
[x] Analysis module in TIR
[x] Search with stages
[x] Evolutionary search mixup, doesn't have to sample randomly
[x] RPC format
[x] Organize workloads into workload.py
[x] Instruction attrs
[x] Buffer
[x] Rename AnnotateLoopType => MarkLoop, AnnotateBlockType => MarkBlock, loop_type => mark
[x] Measure Serialization / Deserialization
[x] Use TE instead of TIR workloads
[x] Fix sample_fusible_loops
[x] FLOPs counter
[x] Refactor SampleFusibleLoops and bug fix
[x] PrintAsPythonAPI
[x] Remove useless TContextInfo in search rules
[x] Exception handling
[ ] Rename blocks
[ ] RenameVars
[ ] RemoveUnitLoop
[x] Use meta schedule primitives in postproc
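As an illustration of the "Tiling size factorization" item above, the sketch below enumerates every ordered way to split a loop extent into a fixed number of tile sizes. The name `factorizations` is hypothetical, and the recursion is a plain textbook approach, not necessarily how this repository implements it.

```python
def factorizations(extent, n):
    """Hypothetical helper: yield every ordered n-tuple of positive
    integers whose product is exactly `extent`."""
    if n == 1:
        yield (extent,)
        return
    for d in range(1, extent + 1):
        if extent % d == 0:
            # Fix the first tile size to d, then factorize the remainder.
            for rest in factorizations(extent // d, n - 1):
                yield (d,) + rest


if __name__ == "__main__":
    for tiles in factorizations(4, 2):
        print(tiles)
```

A sampler could pick uniformly from this list for small extents; for large extents the list grows quickly, so caching or sampling factors directly is likely preferable.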
junrushao commented 3 years ago
It's done