compor / llvm-ir-cmake-utils

LLVM IR CMake utils for bitcode file manipulation by opt and friends
GNU Lesser General Public License v2.1
68 stars 14 forks source link

Custom commands fail with certain generator expressions #18

Closed jvstech closed 1 year ago

jvstech commented 3 years ago

Given this particular line:

https://github.com/compor/llvm-ir-cmake-utils/blob/6a956a411ba157f2ceccf07a85f744eb2455e8c3/cmake/LLVMIRUtil.cmake#L141

If any of the target properties that compose CMD_ARGS contain command-specific generator expressions, CMake will fail.

For example, attaching a bitcode target to a library or executable containing $<OR:$<COMPILE_LANGUAGE:C>,$<COMPILE_LANGUAGE:CXX>>:-UNDEBUG> will cause CMake to fail with the following error:

CMake Error at src/cmake/LLVMIRUtil.cmake:139 (add_custom_command):
  Error evaluating generator expression:

    $<COMPILE_LANGUAGE:C>

  $<COMPILE_LANGUAGE:...> may only be used to specify include directories,
  compile definitions, compile options, and to evaluate components of the
  file(GENERATE) command.

Two possible solutions involve stripping out these types of generator expressions using string(GENEX_STRIP) (NOT ideal) or using file(GENERATE) to write all the arguments to a response file for Clang to consume. This is also less-than-ideal as this response file won't exist until build time so it can't be used for testing compilation flags, but it's better than removing the generator expressions entirely.

compor commented 3 years ago

Yes, there is no provision for generator expressions when in target properties ATM.

I definitely don't like the stripping solution, because I'm not sure what that would leave the bitcode target after that. I'll have to think about the other suggestion with file(GENERATE).

My capacity to work on this is fairly limited ATM. I'd welcome a PR, but I'd like to research the best way forward first.

jvstech commented 3 years ago

No worries. When I get some free time myself, I'll see what I can come up with. It may be possible to use add_library(OBJECT) instead, but I remember there were some limitations with this in the past that may make it a non-starter. Either way, I'll help out however I can.

compor commented 3 years ago

Hello again @jvstech ,

Do you have a minimal CMakeLists.txt that reproduces that error?

jvstech commented 3 years ago

Hey @compor. I do not have a minimal CMakeLists.txt available on-hand. However, I ran into this problem while trying to use llvmir_attach_bc_target on the Compiler-RT builtins project. (I apologize if that isn't helpful.) When I get home from work, I'll attach a minimal example.

compor commented 3 years ago

If you mean this part of the compiler-rt project, I don't see a generator expression in that CMakeLists.txt. Is it something custom?

Also, the COMMAND_EXPAND_LISTS option in add_custom_command seems to partially allow the command to handle generator expressions. Maybe that is something worth trying out.

jvstech commented 3 years ago

My apologies; I should have checked our commit log before I left. It was definitely a custom thing: I was trying to build the Compiler-RT generic builtins (from an in-tree LLVM project) as bitcode to see if it was a viable alternative to statically linking. I believe the problem manifested from here: https://github.com/llvm/llvm-project/blob/1a7d7f423e756922042753f8a2fdc0418b87381a/llvm/cmake/modules/HandleLLVMOptions.cmake#L64

Prior to LLVM 10, it worked fine because this line didn't exist (in this form): https://github.com/llvm/llvm-project/blob/34cd354ea967a6aecee4133fe43e4879355c6dbe/llvm/cmake/modules/HandleLLVMOptions.cmake#L59

Once I get back to work on Monday, I can tell you exactly what the problem was.

In the meantime, here's a minimal CMakeLists.txt example. CMakeLists.txt

compor commented 3 years ago

Thanks for the minimal CMakeLists.txt.

After digging around a bit, it seems that other people have had this issue, like here for example. There is PR over at Kitware's Gitlab associated with this issue that seems to have been merged in 3.19. That's another thing to try out.

Lastly, an example test repo for the aforementioned issue here.

jvstech commented 3 years ago

This certainly does seem more like an issue with CMake that your project just so happens to exercise. I agree this should be tested with the latest CMake release, but even with the upstream change, it's still a problem because the language expressions will always evaluate to false -- which will cause bad compile options to be generated. I'm going to test the add_library(target OBJECT ...) theory today (even though I'm still skeptical it will fix anything).

I know you're busy, so thank you so much for staying on this.

compor commented 3 years ago

So I've tried it with CMake 3.19 and it is still the same problem. Apparently, the issue that was resolved, which I linked previously, is still when TARGET_PROPERTY is used.

However, the add_custom_command does accept generator expressions, just not that specific one, as the documentation for COMPILE_LANGUAGE mentions.

I don't think there is an easy workaround for this.

I'm not sure I understand what you mean by :

it's still a problem because the language expressions will always evaluate to false -- which will cause bad compile options to be generated

Care to elaborate?

Lastly,

I'm going to test the add_library(target OBJECT ...) theory today (even though I'm still skeptical it will fix anything

I'm not following where this line of thought came from, but I'm pretty sure that it won't work. Till recently OBJECT targets had lacking support by CMake.

jvstech commented 3 years ago

Here is what I was able to come up with using an object library:

--- a/LLVMIRUtil.cmake  Fri Nov 20 14:53:05 2020
+++ b/LLVMIRUtil.cmake  Fri Nov 20 14:53:15 2020
@@ -135,13 +135,41 @@
     set(CMD_ARGS "-emit-llvm" ${IN_STANDARD_FLAGS} ${IN_LANG_FLAGS}
       ${IN_COMPILE_OPTIONS} ${CURRENT_COMPILE_FLAGS} ${CURRENT_DEFS}
       ${IN_INCLUDES})
-
-    add_custom_command(OUTPUT ${FULL_OUT_LLVMIR_FILE}
-      COMMAND ${LLVMIR_COMPILER}
-      ARGS ${CMD_ARGS} -c ${INFILE} -o ${FULL_OUT_LLVMIR_FILE}
-      DEPENDS ${INFILE}
-      IMPLICIT_DEPENDS ${LINKER_LANGUAGE} ${INFILE}
-      COMMENT "Generating LLVM bitcode ${OUT_LLVMIR_FILE}"
+    
+    set(tempObjName ${OUTFILE}-temp-object)
+    add_library(${tempObjName} OBJECT ${INFILE})
+    if (NOT "${LINKER_LANGUAGE}" STREQUAL "")
+      set_target_properties(${tempObjName} 
+        PROPERTIES 
+          LINKER_LANGUAGE ${LINKER_LANGUAGE})
+    endif()
+    if (NOT "${IN_STANDARD_FLAGS}" STREQUAL "" OR
+        NOT "${IN_LANG_FLAGS}" STREQUAL "" OR 
+        NOT "${CURRENT_COMPILE_FLAGS}" STREQUAL "")
+      set_target_properties(${tempObjName} PROPERTIES
+        COMPILE_FLAGS -emit-llvm ${IN_STANDARD_FLAGS} ${IN_LANG_FLAGS} ${CURRENT_COMPILE_FLAGS})
+    else()
+      set_target_properties(${tempObjName} PROPERTIES
+        COMPILE_FLAGS -emit-llvm)
+    endif()
+    if (NOT "${IN_COMPILE_OPTIONS}" STREQUAL "")
+      set_target_properties(${tempObjName} PROPERTIES
+        COMPILE_OPTIONS ${IN_COMPILE_OPTIONS})
+    endif()
+    if (NOT "${CURRENT_DEFS}" STREQUAL "")
+      set_target_properties(${tempObjName} PROPERTIES
+        COMPILE_DEFINITIONS ${CURRENT_DEFS})
+    endif()
+    if (NOT "${IN_INCLUDES}" STREQUAL "")
+      set_target_properties(${tempObjName} PROPERTIES
+        INCLUDE_DIRECTORIES ${IN_INCLUDES})
+    endif()
+    
+    add_custom_target(symlink-bc-${tempObjName} ALL
+      DEPENDS ${tempObjName}
+      COMMAND ${CMAKE_COMMAND} -E create_symlink $<TARGET_OBJECTS:${tempObjName}> ${OUT_LLVMIR_FILE}
+      WORKING_DIRECTORY ${WORK_DIR}
+      COMMENT "Symlinking bitcode file from object file ${OUT_LLVMIR_FILE}"
       VERBATIM)

     list(APPEND OUT_LLVMIR_FILES ${OUT_LLVMIR_FILE})

I'm sure it's not the most efficient way of handling this, but it does work around the problem. I don't think it necessitates an entire PR because -- as you mentioned -- the problem is specific to these particular generator expressions. However, by using add_library instead of add_custom_command, I was able to use them. And since they were able to be evaluated to their correct values, I was able to add the correct compile definitions (-UNDEBUG to allow assertions in release mode).

Till recently OBJECT targets had lacking support by CMake.

As far as I'm concerned, they're still very lacking. Setting the SUFFIX property on the object library target does absolutely nothing which is why I added that horrible symlinking code. When I originally wrote this work-around, I was renaming the object files to have the correct file extension, but that added the problem of always having to re-compile the source files (and copying them didn't seem much better).

I still say this isn't a problem with your project -- it's a problem with the underwhelming object library support CMake has.

compor commented 3 years ago

Well, this framework depends on CMake, so it's not doing us any good to say that its support is underwhelming.

When I started this around CMake version 3.2 the support for Object libraries was minimal and I think (although I might be mistaken) options like COMPILE_DEFINITIONS per target were not a thing yet.

Anyhow, this solution is not bad. The symlink code is rather CMake idiomatic. What worries me is the fragility of these symlinks and moving bitcode files around, which is the main purpose of this.

I have a rough solution using file(GENERATE) as we had discussed earlier on this thread. I'll try to set up a test branch with this later this week. The files are produced during the end of the generation step, so all seem to work, but I haven't thoroughly updated the whole library.

jvstech commented 3 years ago

What worries me is the fragility of these symlinks and moving bitcode files around, which is the main purpose of this.

That's my worry, as well, so I'm grateful for any better solution. I tried the file(GENERATE) method myself back when I originally submitted this issue and ran into a lot of filename collision issues which caused CMake to fail with a message about different configuration expressions being written to the same file multiple times. (I was able to work around this by adding the target architecture to the file name I was using.)

Incidentally, now that I'm able to review my commit logs, I remember the original problem: I was using the extracted compile options from the targets generated by your functions to pass them to a compile-testing function similar to try_compile_only. I ended up hitting the problem in my code before yours could execute. Unfortunately, this means I can't use the file(GENERATE) method as I have to test for compilation success before adding the relevant target, and file(GENERATE) would happen too late unless I were to add the whole thing as an external project -- which would require a major rewrite and add unnecessary complexity.

But that's certainly out of scope for this project.

As another possible alternative, since the linker language is known at the time of adding the custom target, it may be possible to do a simple substitution on those particular generator expressions and replace them with a 0 or 1 (for example, replacing $<COMPILE_LANGUAGE:C> with a 1 when the known LINKER_LANGUAGE is C). This is admittedly fragile and wouldn't work where multiple languages are specified (such as $<COMPILE_LANGUAGE:C,CUDA>). Regex could help with that, but it still boils down to being a string replacement.

compor commented 1 year ago

closing as no longer pertinent.