Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

LLVM-IR for coroutines not properly lowered by opt #49629

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR50660
Status NEW
Importance P enhancement
Reported by Christos Perivolaropoulos (C.Perivol@ed.ac.uk)
Reported on 2021-06-10 09:13:49 -0700
Last modified on 2021-07-13 09:12:32 -0700
Version trunk
Hardware PC All
CC dblaikie@gmail.com, llvm-bugs@lists.llvm.org, yedeng.yd@linux.alibaba.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

I am trying to understand how llvm lowers coroutines and I think I encountered a bug in the LLVM compiler. I created a small c program called llvm_intrinsics.c that calls llvm builtins:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

void* f(int n) {
  __builtin_coro_id(0, 0, 0, 0);
  int8_t* hdl = __builtin_coro_begin((int*)malloc(__builtin_coro_size()));
  for (;;) {
    printf("%d\n",++n);
    switch (__builtin_coro_suspend(0)) {
      case 0:
        continue;
      case 1:
        goto CLEANUP;
      default:
        goto SUSPEND;
    }
  }
CLEANUP:
  free(__builtin_coro_free(hdl));
SUSPEND:
  __builtin_coro_end(hdl, 0);
  return hdl;
}

int main() {
  void* hdl = f(4);
  __builtin_coro_resume(hdl);
  __builtin_coro_resume(hdl);
  __builtin_coro_destroy(hdl);
  return 0;
}

This should (and upon inspection does indeed) generate LLVM-IR code that is almost exactly the example in the LLVM documentation [1]. Clang when building C does not run the llvm transformation passes at all, so one needs to run opt by hand. This is unfortunate generally but since the purposes of this exercise are pedagogic it is not a problem. Here is a build script that builds the C file also emitting the intermediate LLVM-IR files.

#!/usr/bin/env bash
set -e

echo "Compiling into LLVM IR"
clang -fcoroutines-ts  -emit-llvm -S llvm_intrinsics.c -o llvm_intrinsics.ll

# echo "Early.."
if [ "$1" == "--holistic-opt" ]; then
    echo "Running all coroutine passes"
    opt -coro-early -coro-split -coro-elide -coro-cleanup llvm_intrinsics.ll -S -o llvm_intrinsics_clean.ll
else
    echo "Running each pass"
    echo "Early.."
    opt -coro-early llvm_intrinsics.ll -S -o llvm_intrinsics_early.ll
    echo "Split.."
    opt -coro-split llvm_intrinsics_early.ll -S -o llvm_intrinsics_split.ll
    echo "Elide.."
    opt -coro-elide llvm_intrinsics_split.ll -S -o llvm_intrinsics_elide.ll
    echo "Cleanup.."
    opt -coro-cleanup llvm_intrinsics_elide.ll -S -o llvm_intrinsics_clean.ll
fi

echo "Compiling LLVM"
clang -lm llvm_intrinsics_clean.ll -o llvm_intrinsics
./llvm_intrinsics

The result I expect when using this script to build and run llvm_intrinsics.c is a) a successful build regardless of the argument provided to the script and b) an executable that only prints 4\n5\n. However here is what it does:

$ ./build.sh
Compiling into LLVM IR
Running each pass
Early..
Split..
Elide..
Cleanup..
Compiling LLVM
fatal error: error in backend: Cannot select: intrinsic %llvm.coro.size
clang-10: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 10.0.1 
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /nix/store/6pzqj9q656vc1msa675k75hmhsrfizsy-clang-10.0.1/bin
clang-10: note: diagnostic msg: PLEASE submit a bug report to  and include the crash backtrace, preprocessed source, and associated run script.
clang-10: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

$ ./build.sh --holistic-opt | head
Compiling into LLVM IR
Running all coroutine passes
Compiling LLVM
5
6
7
8
9
10
11

To put it in words opt fails to run the passes separately and when it runs them together it seems to run them but yields a wrong result. I also tried LLVM 12 and had the same results. Since C++ coroutines work fine I assume this has to do with something I have misunderstood? Is there another way other than opt to run the passes?

This report is adapted from an issue I opened in stackoverflow [2]

[1] https://llvm.org/docs/Coroutines.html#coroutine-representation [2] https://stackoverflow.com/questions/67923082/llvm-ir-for-coroutines-not-properly-lowered-by-opt?noredirect=1#comment120055691_67923082

Quuxplusone commented 3 years ago

Hi,

I didn't reproduce the results. Although I rename 'llvm_intrinsics.c' to 'llvm_intrinsics.cpp' and use clang++ to compile it, do you think it matters?

The compiler I used is the trunk. What's the version you used?

By the way, this sentence seems not correct:

Clang when building C does not run the llvm transformation passes at all

If it builds a cpp program, it would run the transformation passes. We need to add option '-Xclang -disable-llvm-passes' to disable them.

Quuxplusone commented 3 years ago

Well I see what you are saying, my semi- hypothetical end goal was to see how I could build a language that uses the coroutine infrastructure or instrument the latter to an existing language that targets llvm (eg rust). So I would like to put together the fundamentals rather than pretending that I am using c++. Do you think that this is beyond the scope of llvm or even clang?

Quuxplusone commented 3 years ago
(In reply to Christos Perivolaropoulos from comment #2)
> Well I see what you are saying, my semi- hypothetical end goal was to see
> how I could build a language that uses the coroutine infrastructure or
> instrument the latter to an existing language that targets llvm (eg rust).
> So I would like to put together the fundamentals rather than pretending that
> I am using c++. Do you think that this is beyond the scope of llvm or even
> clang?

I think it isn't beyond the scope for llvm but beyond the scope of clang. I
mean that it may and should be possible to use the coroutine infrastructure for
other languages. But we need to write new front end to support it. See
clang/lib/CodeGen/CreatePasses.cpp for details.

By the way, it is possible that the existing pipeline for coroutine has bugs.
Let me know if you feel not comfortable.
Quuxplusone commented 3 years ago

Thanks for the reply. I doubt that it's an bug related issue that I am facing because this c program successfully produces the llvm code shown in the documentation. To me this means that at least clang works as expected and the issue lies with the llvm compiler. In any case, there must be a way to compile this code without resorting to c++ even if it is quirky and does not align with the documentation. What I would ideally want is someone who is comfortable with the passes infrastructure to a) help me get this c code to run and b) pinpoint where the problem is so we can get llvm to behave in a predictable way (or document it).

PS. I can't find the path that you mention. Which branch are you referring to?

Quuxplusone commented 3 years ago
(In reply to Christos Perivolaropoulos from comment #4)
> Thanks for the reply. I doubt that it's an bug related issue that I am
> facing because this c program successfully produces the llvm code shown in
> the documentation. To me this means that at least clang works as expected
> and the issue lies with the llvm compiler. In any case, there must be a way
> to compile this code without resorting to c++ even if it is quirky and does
> not align with the documentation. What I would ideally want is someone who
> is comfortable with the passes infrastructure to a) help me get this c code
> to run and b) pinpoint where the problem is so we can get llvm to behave in
> a predictable way (or document it).
>
> PS. I can't find the path that you mention. Which branch are you referring
> to?

Sorry, the path is `clang/lib/CodeGen/BackendUtil.cpp/CreatePasses`.

Your words make sense to me.  There is issue report related to coroutine
pipeline in the mailing list too. I would try to look it.

BTW, if you want to support it for other languages like rust, I think it may be
better to add passes in the frontend.
Quuxplusone commented 3 years ago

After patching up: https://reviews.llvm.org/D105877, the example would fail still due to un-lowered __builtin_coro_* builtins. I think it is by design. To use coroutine for other languages, we should use the coroutine intrinsics (https://llvm.org/docs/Coroutines.html#llvm-coro-end-intrinsic).

Although I understand that it should be natural that the builtin_* intrinsics be compatible for C and C++, I think it is not a bug. (It should be easy to fix if we treat it as bug).

If you are fine with this, I would mark this as fixed if the patch could be checked-in.

Quuxplusone commented 3 years ago
Thanks a lot for taking the time to look at this in detail I appreciate it.

 I am slightly confused, I understand that the proper way for a language to use coroutines would be to emit llvm code that uses the intrinsics you link to, but I don't really get how that is different to what the c code in question does via the builtin_coro_* builtin.

Let me put my confusion another way: forget the c code, say we hand crafted
llvm code that uses coroutines in accordance to the tutorial you linked. How
would one go about a) lowering that to non-coroutine code by running the right
passes, b) compiling it to machine code?
Quuxplusone commented 3 years ago

(In reply to Christos Perivolaropoulos from comment #7)

Thanks a lot for taking the time to look at this in detail I appreciate it.

I am slightly confused, I understand that the proper way for a language to use coroutines would be to emit llvm code that uses the intrinsics you link to, but I don't really get how that is different to what the c code in question does via the builtincoro* builtin.

Let me put my confusion another way: forget the c code, say we hand crafted llvm code that uses coroutines in accordance to the tutorial you linked. How would one go about a) lowering that to non-coroutine code by running the right passes, b) compiling it to machine code?

This is what https://reviews.llvm.org/D105877 trying to solve. After patching this up, it should be easy to compile .ll or .bc files by clang or opt.

but I don't really get how that is different to what the c code in question does via the builtincoro* builtin.

In implementation for builtins, it treat different builtins different for C_LANG, CXX_LANG and other languages. It is natural since C and C++ are two different languages. The point here is that builtin are language dependent.

BTW, it looks like a bug indeed that builtin_coro* can't be used by clang (not clang++) with fcoroutines-ts.

Quuxplusone commented 3 years ago

Ok, it's hard for me to build LLVM right now, maybe I will try it out during the weekend. Thanks again.