Closed junrushao closed 1 month ago
@junrushao1994 ,
While looking into TVM's auto-tensorization ability (to explore the search space of accelerator designs and custom ISAs), permit me to ask:
Was Auto Tensorization removed from this list (it was in section [M4b], if I recall)? What was/is the plan for it?
Thank you!
Hey @cbalint13 thanks for asking! Absolutely!
Was Auto Tensorization removed from this list (it was in section [M4b], if I recall)? What was/is the plan for it?
The only reason is that I'm trying to organize the roadmap. Auto tensorization is a huge item, and we want to have a separate tracking issue for it. As you may have seen, we have been upstreaming auto tensorization-related PRs, including #9871 and #10066. My branch also contains working auto tensorization examples if you want to try them out now :-)
Also, regarding the design plan, will it have something in common with the principles of https://arxiv.org/abs/2101.08458?
That work was done by my colleagues, so of course we are aware of it, and we have a lot in common :-) Their codebase is public here. The difference is that we are now using TensorIR, a more powerful and systematic IR/scheduling system, to support tensorization.
@junrushao1994
First, thanks a lot for your time!
Can't wait to try it; I will look into the early WIP branch you mentioned.
Many thanks again!
Thank you @cbalint13 for your kind response! We are super excited to hear about your work and more than happy to assist/collaborate on TensorIR/MetaSchedule!
It would be good to get a status update, @junrushao1994. I would suggest we move the follow-up non-infra parts into separate tracking issues to keep things traceable.
This is a global tracking issue for landing the meta schedule. The RFC can be found here.
Steps
The steps are numbered following TensorIR (#7527).
[M3a] Core infrastructure
[M3b] Enable measurement
[M3c] Enhance search
[M4a] Performance & Coverage
Schedule Rules
PostProcessors
Mutators
User interface
Misc
[M4b] Relay integration
M5. Operator coverage with all backends for auto tensorization
Being able to tensorize on all the backends
TileWithTensorIntrin
#11050 #11075
M6. Memory optimization
Important for CUDA performance, not CPU. Not related to functionality.
M7. Unblock end-to-end experiments
M8. Broader Set of Intrinsics and Optimization
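For readers new to the term, the tensorization targeted by M5 can be pictured roughly as follows. This is a minimal sketch in plain Python, not TVM's API: the function names, the 4x4 tile size, and the stand-in "intrinsic" are all assumptions made for illustration. The idea is to tile a loop nest so that the innermost tile matches the shape of a hardware tensor instruction, then replace that inner tile with a single call to the instruction.

```python
# Illustrative only: plain Python, NOT TVM's API. All names are hypothetical.
TILE = 4  # assumed tile size handled by the (pretend) hardware intrinsic


def intrin_matmul_4x4(c, a, b, i0, j0, k0):
    """Stand-in for a hardware tensor intrinsic.

    Computes C[i0:i0+4, j0:j0+4] += A[i0:i0+4, k0:k0+4] @ B[k0:k0+4, j0:j0+4].
    """
    for i in range(TILE):
        for j in range(TILE):
            acc = 0
            for k in range(TILE):
                acc += a[i0 + i][k0 + k] * b[k0 + k][j0 + j]
            c[i0 + i][j0 + j] += acc


def matmul_naive(a, b):
    """Reference triple loop, before any tiling."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tensorized(a, b):
    """Same computation after tiling; each inner tile is one intrinsic call."""
    n, m, p = len(a), len(b), len(b[0])
    # The sketch assumes every dimension divides evenly by the tile size.
    assert n % TILE == 0 and m % TILE == 0 and p % TILE == 0
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, TILE):          # outer loops walk over tiles
        for j0 in range(0, p, TILE):
            for k0 in range(0, m, TILE):
                intrin_matmul_4x4(c, a, b, i0, j0, k0)  # "tensorized" body
    return c
```

Both functions produce the same result; only the loop structure differs. An auto-tensorization search would look for such a tiling automatically, given a description of the intrinsic.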