llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

JIT always emits far calls #13888

Open nunoplopes opened 11 years ago

nunoplopes commented 11 years ago
Bugzilla Link: 13516
Version: trunk
OS: All
CC: @asl

Extended Description

I got this e-mail from Tim Starling (reproduced with authorization):

"I think that the problem is probably exhaustion of the branch target buffer. In x86-64 with CodeModel::Large, every call becomes register-indirect, like:

    0x400dc10e: movabs $0x40090bb0,%rax
    0x400dc118: mov %r14,%rdi
    0x400dc11b: mov %rbx,%rsi
    0x400dc11e: callq *%rax

Based on my reading of the Intel optimization reference manual, each such call site will use up a slot in the branch target buffer. Intel doesn't include the size of it on their spec sheets or in the optimization manual, but other sources say that it has 512 entries, except for a few very recent processors which have 1024 entries.

I tried using CodeModel::Small, but it just caused an assert error when it encountered calls to functions outside of RIP+2GB, instead of upgrading them to register-indirect calls:

php: X86CodeEmitter.cpp:477: void::Emitter::emitMemModRMByte(const llvm::MachineInstr&, unsigned int, unsigned int, intptr_t) [with CodeEmitter = llvm::JITCodeEmitter]: Assertion `IndexReg.getReg() == 0 && Is64BitMode && "Invalid rip-relative address"' failed.
Stack dump:

  1. Running pass 'X86 Machine Code Emitter' on function '@ZEND_CAST_SPEC_CONST_HANDLER'

Apparently there is no support in LLVM for some calls being short and some being long. When code is compiled with clang or llc, it can use CodeModel::Small and rely on the fact that all calls to code outside of the 2GB neighbourhood will be via the PLT.
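The short/long distinction above comes down to whether the target is reachable with the signed 32-bit displacement of a direct `call rel32`. A minimal sketch of that range check follows; the function names `callDisplacement` and `fitsInRel32` are hypothetical and are not part of LLVM, and the 5-byte length assumes the plain `E8 rel32` encoding.

```cpp
#include <cstdint>
#include <limits>

// Length in bytes of a direct near call on x86-64: opcode E8 plus a 32-bit displacement.
constexpr std::uint64_t kDirectCallLength = 5;

// Displacement a direct call placed at `callSite` would need to reach `target`.
// It is measured from the address of the instruction following the call.
// Assumes the two addresses are less than 2^63 bytes apart.
constexpr std::int64_t callDisplacement(std::uint64_t callSite, std::uint64_t target) {
    return static_cast<std::int64_t>(target - (callSite + kDirectCallLength));
}

// True if the displacement fits in a signed 32-bit field, i.e. the target lies
// within roughly +/-2GB of the call site and a direct call could be used.
constexpr bool fitsInRel32(std::uint64_t callSite, std::uint64_t target) {
    return callDisplacement(callSite, target) >= std::numeric_limits<std::int32_t>::min() &&
           callDisplacement(callSite, target) <= std::numeric_limits<std::int32_t>::max();
}
```

With the addresses from the listing in this message (call site 0x400dc11e, target 0x40090bb0), the displacement is -0x4b573, which fits, so that particular call would not have needed the register-indirect form. A target far from the JIT buffer, for example a hypothetical 0x7f0000000000, does not fit.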

CodeModel::JITDefault is apparently a hack to work around the lack of awareness of the RIP address in X86DAGToDAGISel. Any LLVM JIT will have the same performance issue when more than 512 call instructions appear in a loop."

llvmbot commented 6 years ago

All IR call instructions are created through BasicBlock objects, not native addresses. You don't need to emit jumps for IR yourself unless you are not using the JIT (you will need to when you use the MC instruction builder).

llvmbot commented 6 years ago

Doesn't a JIT on a 64-bit architecture have to use far absolute calls? JITs don't even have IATs.

Short or indirect jumps are usually used for local jumps, such as conditional branches or jumps to the IAT.

JITs don't have any top-level IAT to indirect through, so using long or absolute calls is better, in my opinion.