swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.59k stars 10.37k forks source link

StringSwitch in Sema/TypeCheckAttr.cpp creates pathologically large code #76387

Open rnk opened 2 months ago

rnk commented 2 months ago

Description

Clang is having issues compiling this line of C++ code in the Swift compiler. The code in question does this:

  return llvm::StringSwitch<bool>(symbol)
#define FUNCTION(_, Name, ...) \
    .Case(#Name, false) \
    .Case("_" #Name, false) \
    .Case(#Name "_", false) \
    .Case("_" #Name "_", false)
#include "swift/Runtime/RuntimeFunctions.def"
    .Default(true);

This is just inefficient C++. There are 278 hits for FUNCTION in RuntimeFunctions.def, so this creates 1112 .Case method calls. This generates a large CFG which is expensive for the compiler to optimize, and is ultimately not optimal code, it is notionally a sequence of if (S == "asdf") return false;.

For compilation efficiency and runtime efficiency, please consider representing the list of runtime functions as data, like with a StringSet of names that you can test against.

We may consider filing an additional bug against clang if we can produce a reproducer for Clang, but I wanted to file this in Swift first because there is an opportunity to make the Swift compiler more efficient.

cc @compnerd for amusement, as the person I probably know best who works on Swift.

Reproduction

Touch the TypeCheckAttr.cpp C++ file, rebuild Swift, observe how long it takes to compile. Rebuild that file with -ftime-trace and observe how much time is spent compiling this StringSwitch.

Expected behavior

This file should compiler faster and with smaller object code.

Environment

appears present at head

Additional information

No response

rnk commented 2 months ago

Debugging a little bit further, it appears this code pattern can lead to compiler stack overflow, because these repeated .Case method invocations are nested expressions. Here is a sample of the stack trace, which goes 3440 frames deep:

#3399 0x000055555a9cb45c in clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) ()
#3400 0x000055555a993c31 in clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ()
#3401 0x000055555a99acd1 in clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) ()
#3402 0x000055555a9cb45c in clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) ()
#3403 0x000055555a993c31 in clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ()
#3404 0x000055555a99acd1 in clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) ()
#3405 0x000055555a9cb45c in clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) ()
#3406 0x000055555a993c31 in clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ()
#3407 0x000055555a99acd1 in clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) ()
#3408 0x000055555a9cb45c in clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) ()
#3409 0x000055555a993c31 in clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ()
#3410 0x000055555a99acd1 in clang::CodeGen::CodeGenFunction::EmitLValueHelper(clang::Expr const*, clang::CodeGen::KnownNonNull_t) ()

From the Swift PoV, this is just a slow (sometimes stack-overflow-ing) compile step, but I'd recommend using a simple hash set for this code pattern anyway.

cc @allevato @ilya-biryukov