@dotnet/jit-contrib
IMO we should not split the design of auto-vectorization into small parts to be worked on independently, as proposed here. On the contrary, we should consider a more general compiler design that is architecturally capable of supporting multiple existing auto-vectorization algorithms/solutions as compilation tiers. This issue would then be just one of many points that matter for implementing such a feature, and the discussion should start at the very beginning of the design process, with a proposal requesting auto-vectorization optimization support in RyuJIT.
Yes, I agree: auto-vectorization should not be split into small optimization phases. But I am not sure auto-vectorization can be separated from the loop optimization phase. I agree the design should not be split, but it still seems to be a kind of loop optimization. Remember that duplicating variables outside of loop unrolling is inefficient, so the vectorization phase should run after all loop optimization phases have completed. A sketch of the duplication in question follows below.
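For context, here is a minimal sketch of the kind of variable duplication being discussed (my own illustration, not an actual JIT transform): splitting a reduction's single accumulator across an unrolled loop body breaks the loop-carried dependency, which is why doing it together with unrolling, rather than as a separate phase, pays off.

```csharp
// Hypothetical example: manual "variable duplication" plus 2x unrolling.
// sum0/sum1 are independent, so a superscalar core can overlap the adds,
// and a later vectorization pass could map them onto SIMD lanes.
// (Note: reassociating float adds changes rounding, which is one reason
// a JIT cannot silently do this for floating-point reductions.)
static float SumDuplicated(float[] a)
{
    float sum0 = 0f, sum1 = 0f;        // duplicated accumulators
    int i = 0;
    for (; i <= a.Length - 2; i += 2)  // body unrolled by 2
    {
        sum0 += a[i];
        sum1 += a[i + 1];
    }
    if (i < a.Length)                  // odd-length remainder
        sum0 += a[i];
    return sum0 + sum1;
}
```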
I would agree that we need a holistic design before embarking on bits and pieces of support for auto-vectorization. @ArtBlnd if you are agreeable, I would suggest that you change the title of this issue to something like "Add JIT support for auto-vectorization" and then we can use this as a meta-issue to track the discussion of requirements, as well as potentially breaking out individual issues that contribute to the overall design of the JIT optimizations, as well as to auto-vectorization. For example, before we contemplate auto-vectorization, we need to have improved loop analysis.
All of that being said, this is also closely tied to future plans for multi-tiered jitting. It may be that we decide to introduce a non-RyuJIT-based higher-tier JIT that incorporates optimizations such as auto-vectorization. I'm not announcing such a thing, just pointing out that there are a host of high-level architectural design choices that will feed into where and how we support auto-vectorization and other higher-cost optimizations in future.
Now that HWIntrinsics are supported, this would be much more feasible to do on x86/x64. The simplest form of vectorization is really just a form of loop unrolling after all 😄
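To make that concrete, here is a hedged sketch (my own example, not code from the JIT) of a scalar loop next to the hand-unrolled form the HWIntrinsics API now makes expressible on x86/x64, which is roughly the shape an auto-vectorizer would emit:

```csharp
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static class AddExample
{
    // The scalar loop a developer writes and the JIT would ideally vectorize.
    static void AddScalar(float[] a, float[] b, float[] dst)
    {
        for (int i = 0; i < dst.Length; i++)
            dst[i] = a[i] + b[i];
    }

    // The same loop unrolled by the 128-bit vector width (4 floats)
    // using the SSE hardware intrinsics, with a scalar tail.
    // (Requires compiling with unsafe code enabled.)
    static unsafe void AddSse(float[] a, float[] b, float[] dst)
    {
        int i = 0;
        if (Sse.IsSupported)
        {
            fixed (float* pa = a, pb = b, pd = dst)
            {
                for (; i <= dst.Length - 4; i += 4)
                    Sse.Store(pd + i, Sse.Add(Sse.LoadVector128(pa + i),
                                              Sse.LoadVector128(pb + i)));
            }
        }
        for (; i < dst.Length; i++)    // remainder / non-SSE fallback
            dst[i] = a[i] + b[i];
    }
}
```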
@tannergooding I understand what you are talking about, so I am just waiting for that loop unrolling PR to be merged.
Once it is merged, I'll implement vectorization after the loop optimization phase.
Maybe someday you (the .NET team) could create an AI inside the JIT that takes raw C# or IL code as input and generates the fastest possible asm. If developers want to try it, they could add an attribute such as [AIOptimization] and would know it can generate wrong code that does not work. After testing that it works correctly, they can start using it. Just an idea I wanted to share. How does it sound? Impossible? Or interesting?
@faruknane
@svick you will see what tomorrow brings. I believe it's applicable.
IMHO @faruknane is on point here. Teaching AI models strict grammars by training them on code repositories (millions of which are available here on GitHub) yields robust transformation models that can handle any programming language correctly. AI does this better than humans for formal languages, while the reverse holds for natural languages. Coding AI models and tasks are a constant topic at a multitude of conferences, and the kind of constrained transformation model an auto-vectorization task requires fits current AI research directions.
Contrary to some popular opinions, IMO the auto-vectorization task is a perfect fit for AI; the real hurdle is bridging compiler technology with AI.
@RussKeldorph
I am surprised it may work too ...
For starters, see The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy, 2015 (I know, that's AI paleontology already).
Then something from Oracle for the main course: Code Generation using LSTM (Long Short-Term Memory) RNN by Meena Vyas, November 2018.
And some dessert: If AI Is Already Writing Code, Will Programmers Lose Their Jobs?
Jokes aside, I keep coming across articles and conference papers on AI doing software-developer work, but I don't have the citations at hand. A search returns a lot of results.
How about this one: A Survey of Machine Learning for Big Code and Naturalness by Miltiadis Allamanis et al., May 2018.
This article has the best title I have seen so far: Machine Learning in Compiler Optimisation by Zheng Wang and Michael O'Boyle, arXiv, 9 May 2018.
There is a lot going on in code generation. Some teams at Microsoft are also doing research on this topic and sharing the results on Microsoft Research's YouTube channel.
Exactly; Miltiadis Allamanis is at Microsoft Research Cambridge, UK.
This design discussion has played out, for now. The JIT team would love to spend more time thinking about auto-vectorization, but it is unlikely to happen in the near or even medium term. And its impact, while potentially great, is quite narrow, and can largely be achieved today with the now-existing hardware intrinsics. So I'm going to close this.
Current RyuJIT does not support superscalar-oriented local variable duplication or auto-vectorization. If you have any relevant discussion or information, please feel free to post it here.

To-do list:

- `Vector<T>` support first
- Unrolled normal ALU operations after
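As a rough illustration of the `Vector<T>` target (my own sketch, not proposed JIT output), an auto-vectorizer would rewrite a scalar loop into `Vector<float>.Count`-wide steps plus a scalar tail:

```csharp
using System.Numerics;

static class VectorTExample
{
    // Scalar source loop.
    static void MulScalar(float[] a, float[] b, float[] dst)
    {
        for (int i = 0; i < dst.Length; i++)
            dst[i] = a[i] * b[i];
    }

    // The Vector<T> form an auto-vectorizer could emit: process
    // Vector<float>.Count elements per iteration, then a scalar tail.
    static void MulVectorized(float[] a, float[] b, float[] dst)
    {
        int width = Vector<float>.Count;
        int i = 0;
        for (; i <= dst.Length - width; i += width)
        {
            var v = new Vector<float>(a, i) * new Vector<float>(b, i);
            v.CopyTo(dst, i);
        }
        for (; i < dst.Length; i++)
            dst[i] = a[i] * b[i];
    }
}
```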