Open Quuxplusone opened 9 years ago
Attached bad.ll
(16808 bytes, application/octet-stream): Bad IR
Attached bad.instcombine.ll
(15426 bytes, application/octet-stream): Bad IR after InstCombine
Some additional context: internally, if a loop has a known iteration count and its body could be simplified by complete unrolling, we raise the unroll threshold to make the loop more likely to be unrolled. Unlike the community branch, we double the threshold on the second call to the LoopUnrollPass, and that is what exposes the issue.
One of our internal developers has proposed caching the chain results.
Does the below look like a reasonable solution? If so, I'll upload a complete patch and associated test cases.
----------------------------------------------------------------------------------------
diff --git a/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index a610e57..5f43296 100755
--- a/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1980,6 +1980,8 @@ enum ChainResult {
CR_LeadsToInteriorNode
};
+typedef DenseMap<const SDNode *, ChainResult> TFCacheMapT;
+
/// WalkChainUsers - Walk down the users of the specified chained node that is
/// part of the pattern we're matching, looking at all of the users we find.
/// This determines whether something is an interior node, whether we have a
@@ -1992,7 +1994,8 @@ enum ChainResult {
static ChainResult
WalkChainUsers(const SDNode *ChainedNode,
SmallVectorImpl<SDNode*> &ChainedNodesInPattern,
- SmallVectorImpl<SDNode*> &InteriorChainedNodes) {
+ SmallVectorImpl<SDNode*> &InteriorChainedNodes,
+ TFCacheMapT &TFCacheMap) {
ChainResult Result = CR_Simple;
for (SDNode::use_iterator UI = ChainedNode->use_begin(),
@@ -2073,7 +2076,16 @@ WalkChainUsers(const SDNode *ChainedNode,
// as a new TokenFactor.
//
// To distinguish these two cases, do a recursive walk down the uses.
- switch (WalkChainUsers(User, ChainedNodesInPattern, InteriorChainedNodes)) {
+
+ ChainResult UserResult;
+ if (TFCacheMap.count(User)) {
+ UserResult = TFCacheMap[User];
+ } else {
+ UserResult = WalkChainUsers(User, ChainedNodesInPattern,
+ InteriorChainedNodes, TFCacheMap);
+ TFCacheMap[User] = UserResult;
+ }
+ switch (UserResult) {
case CR_Simple:
// If the uses of the TokenFactor are just already-selected nodes, ignore
// it, it is "below" our pattern.
@@ -2114,9 +2126,10 @@ HandleMergeInputChains(SmallVectorImpl<SDNode*> &ChainNodesMatched,
// users of the chain result. This adds any TokenFactor nodes that are caught
// in between chained nodes to the chained and interior nodes list.
SmallVector<SDNode*, 3> InteriorChainedNodes;
+ TFCacheMapT TFCacheMap;
for (unsigned i = 0, e = ChainNodesMatched.size(); i != e; ++i) {
if (WalkChainUsers(ChainNodesMatched[i], ChainNodesMatched,
- InteriorChainedNodes) == CR_InducesCycle)
+ InteriorChainedNodes, TFCacheMap) == CR_InducesCycle)
return SDValue(); // Would induce a cycle.
}
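
To illustrate why caching the chain results can help, here is a minimal standalone sketch. It is not LLVM code: `Node`, `buildLadder`, `walkUncached`, and `walkCached` are hypothetical names, and `std::unordered_map` stands in for the `DenseMap` used in the patch. It assumes the slowdown comes from re-walking users that are reachable along many paths, which the ladder-shaped graph below deliberately exaggerates. Unlike the patch, which consults the cache at the call site, the sketch consults it on entry to the walk.

```cpp
#include <cstdint>
#include <deque>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for SDNode: a node that only knows its users.
struct Node {
  std::vector<const Node *> Users;
};

enum ChainResult { CR_Simple, CR_InducesCycle, CR_LeadsToInteriorNode };

using CacheT = std::unordered_map<const Node *, ChainResult>;

// Uncached walk: a node reachable along k paths is visited k times.
static ChainResult walkUncached(const Node *N, std::uint64_t &Visits) {
  ++Visits;
  ChainResult Result = CR_Simple;
  for (const Node *U : N->Users)
    if (walkUncached(U, Visits) != CR_Simple)
      Result = CR_LeadsToInteriorNode;
  return Result;
}

// Cached walk: each node is fully processed once; later calls are lookups.
static ChainResult walkCached(const Node *N, CacheT &Cache,
                              std::uint64_t &Visits) {
  auto It = Cache.find(N);
  if (It != Cache.end())
    return It->second;
  ++Visits;
  ChainResult Result = CR_Simple;
  for (const Node *U : N->Users)
    if (walkCached(U, Cache, Visits) != CR_Simple)
      Result = CR_LeadsToInteriorNode;
  Cache.emplace(N, Result);
  return Result;
}

// Builds a root followed by Depth layers of two nodes each; every node in
// one layer is used by both nodes of the next layer. std::deque keeps node
// addresses stable while new nodes are appended.
static const Node *buildLadder(unsigned Depth, std::deque<Node> &Storage) {
  Storage.emplace_back();
  Node *Root = &Storage.back();
  std::vector<Node *> Prev{Root};
  for (unsigned i = 0; i != Depth; ++i) {
    Storage.emplace_back();
    Node *A = &Storage.back();
    Storage.emplace_back();
    Node *B = &Storage.back();
    for (Node *P : Prev) {
      P->Users.push_back(A);
      P->Users.push_back(B);
    }
    Prev = {A, B};
  }
  return Root;
}
```

On a ladder of depth 10, the uncached walk makes 2047 visits while the cached walk makes 21. The real WalkChainUsers also takes the ChainedNodesInPattern and InteriorChainedNodes vectors by reference; the sketch does not model them.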