JuliaGPU / GPUCompiler.jl

Reusable compiler infrastructure for Julia GPU backends.

Calling device function with CC from kernel with CC results in trap #97

Open maleadt opened 3 years ago

maleadt commented 3 years ago
; ModuleID = 'reduce.bc'
source_filename = "reduce.ll"

define ptx_device void @child() {
  ret void
}

define ptx_kernel void @parent() {
  call void @child()
  ret void
}

This gets optimized to a trap: instcombine turns the call into a store i1 true, i1* undef, align 1, after which simplifycfg reduces it to a trap; unsure why.

maleadt commented 3 years ago

Same happens with SPIR-V.

maleadt commented 3 years ago
    // If the call and callee calling conventions don't match, this call must
    // be unreachable, as the call is undefined.
    if (CalleeF->getCallingConv() != Call.getCallingConv() &&
        // Only do this for calls to a function with a body.  A prototype may
        // not actually end up matching the implementation's calling conv for a
        // variety of reasons (e.g. it may be written in assembly).
        !CalleeF->isDeclaration()) {
      Instruction *OldCall = &Call;
      CreateNonTerminatorUnreachable(OldCall);
      // If OldCall does not return void then replaceAllUsesWith undef.
      // This allows ValueHandlers and custom metadata to adjust itself.
      if (!OldCall->getType()->isVoidTy())
        replaceInstUsesWith(*OldCall, UndefValue::get(OldCall->getType()));
      if (isa<CallInst>(OldCall))
        return eraseInstFromFunction(*OldCall);

Hah, ok. What's the use of these then?
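
Going by the check quoted above, the mismatch appears to be between the call site and the callee rather than between @parent and @child: the call in the reproducer carries no calling convention, so it defaults to the C calling convention while @child is ptx_device. A minimal sketch of the same module with the convention repeated on the call instruction, which should make both sides of that comparison agree (not verified here against any particular LLVM version):

```llvm
; Same reproducer, with the call site annotated to match its callee.
define ptx_device void @child() {
  ret void
}

define ptx_kernel void @parent() {
  ; The calling convention on the call must match the one on @child;
  ; an unannotated call defaults to the C calling convention.
  call ptx_device void @child()
  ret void
}
```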