Open dalaen opened 1 year ago
Hi @dalaen
Thanks for your report. The first thing I have tried is to assess if it had impact on the include dirs resolution:
def test_cmakedeps_semicolon_escape():
c = TestClient()
pkg = textwrap.dedent("""\
import os
from conan import ConanFile
from conan.tools.files import save
class Pkg(ConanFile):
def package(self):
save(self, os.path.join(self.package_folder, "sub1/f1.h"), "")
save(self, os.path.join(self.package_folder, "sub2/f2.h"), "")
def package_info(self):
self.cpp_info.includedirs = ["sub1", "sub2"]
""")
app = textwrap.dedent("""
from conan import ConanFile
from conan.tools.cmake import CMake
class Pkg(ConanFile):
settings = "os", "arch", "compiler", "build_type"
requires = "pkg/0.1"
generators = "CMakeDeps", "CMakeToolchain"
def build(self):
cmake = CMake(self)
cmake.configure()
cmake.build()
""")
c.save({"pkg/conanfile.py": pkg,
"app/conanfile.py": app,
"app/CMakeLists.txt": gen_cmakelists(appsources=["main.cpp"], find_package=["pkg"]),
"app/main.cpp": gen_function_cpp(name="main", includes=["f1", "f2"])
})
c.run("create pkg --name=pkg --version=0.1")
c.run("build app")
assert "conanfile.py: Running CMake.build()" in c.out
Apparently it is working fine, and it is not an issue for CMake to interpret and use the string correctly.
The issue would come from this expression in our generated .cmake file:
set_property(TARGET {{root_target_name}}
PROPERTY INTERFACE_INCLUDE_DIRECTORIES
$<$<CONFIG:{{configuration}}>:${{'{'}}{{pkg_name}}_INCLUDE_DIRS{{config_suffix}}}> APPEND)
I'd expect that the escape characters are to be used when the definition of the generator expression happens, and there is no semicolon in that expression at all. What I would not expect is to have to escape the contents of the variable at all. That would mean that for every generator expression that is used out there using variables contents, those contents should be replace all possible escape characters?
As a side question: why would you like to get those directories anyway? Why just target_link_libraries()
the target doesn't work?
Hi @memsharded,
Thanks for your reply and investigation!
I'd expect that the escape characters are to be used when the definition of the generator expression happens, and there is no semicolon in that expression at all. What I would not expect is to have to escape the contents of the variable at all. That would mean that for every generator expression that is used out there using variables contents, those contents should be replace all possible escape characters?
I would concur to that if I understand you 100%. The issue is here, I'm not generating the expression that's put into the generator-expression at all.
The input is:
self.cpp_info.includedirs = ["sub1", "sub2"]
and the output is:
$<$<CONFIG:Release>:PKG_ROOT/sub1;PKG_ROOT/sub2>
I have no chance, as a user, to escape the generated semicolon. Or did I get you wrong?
To your side question: we have some custom CMake functions which are collecting a CMake target's dependencies and include folders to provide them as command line arguments to an external program for generating code.
The command line argument string is built by looping over the include folders, and because the foreach()
loop in CMake takes a semicolon or a whitespace as a splitting token, we get to this issue here when it's combined with generator-expressions.
It will basically give:
$<$<CONFIG:Release>:PKG_ROOT/sub1
PKG_ROOT/sub2>
To your side question: we have some custom CMake functions which are collecting a CMake target's dependencies and include folders to provide them as command line arguments to an external program for generating code. The command line argument string is built by looping over the include folders, and because the foreach() loop in CMake takes a semicolon or a whitespace as a splitting token, we get to this issue here when it's combined with generator-expressions.
It seems that you are trying to interpret a generator expression at configure time, isn't it? For parsing the INTERFACE_INCLUDE_DIRECTORIES variable at configure time, having a escape character or not, would be similar. Consider that you could also get some value for it like:
$<$<CONFIG:Release>:PKG_ROOT/sub1;PKG_ROOT/sub2>$<$<CONFIG:Debug>:PKG_ROOT/sub1;PKG_ROOT/sub2>
How would the $<SEMICOLON>
replacement help there to get the list of folders? It seems you cannot simply iterate the value of this variable to get folders. Maybe while parsing of the variable, you might be dropping or filtering out the $<$<CONFIG:Release>
part?
I have tried it in the test:
def test_cmakedeps_semicolon_escape():
c = TestClient()
pkg = textwrap.dedent("""\
import os
from conan import ConanFile
from conan.tools.files import save
class Pkg(ConanFile):
def package(self):
save(self, os.path.join(self.package_folder, "sub1/f1.h"), "")
save(self, os.path.join(self.package_folder, "sub2/f2.h"), "")
def package_info(self):
self.cpp_info.includedirs = ["sub1", "sub2"]
""")
app = textwrap.dedent("""
from conan import ConanFile
from conan.tools.cmake import CMake
class Pkg(ConanFile):
settings = "os", "arch", "compiler", "build_type"
requires = "pkg/0.1"
generators = "CMakeDeps", "CMakeToolchain"
def build(self):
cmake = CMake(self)
cmake.configure()
cmake.build()
""")
cmake = textwrap.dedent("""
set(CMAKE_CXX_COMPILER_WORKS 1)
set(CMAKE_CXX_ABI_COMPILED 1)
set(CMAKE_C_COMPILER_WORKS 1)
set(CMAKE_C_ABI_COMPILED 1)
cmake_minimum_required(VERSION 3.15)
project(project CXX)
find_package(pkg)
get_target_property(output pkg::pkg INTERFACE_INCLUDE_DIRECTORIES)
message(STATUS "${output}")
foreach(_include ${output})
message(STATUS "INCLUDE=${_include}")
endforeach()
add_executable(myapp main.cpp )
target_include_directories(myapp PUBLIC "include")
target_link_libraries(myapp pkg::pkg )
""")
c.save({"pkg/conanfile.py": pkg,
"app/conanfile.py": app,
"app/CMakeLists.txt": cmake,
"app/main.cpp": gen_function_cpp(name="main", includes=["f1", "f2"])
})
c.run("create pkg --name=pkg --version=0.1")
c.run("build app")
print(c.out)
assert "conanfile.py: Running CMake.build()" in c.out
And the output (with a change in the CMakeDeps to change it to SEMICOLON is still:
-- INCLUDE=$<$<CONFIG:Release>:T:/tmp19ifl7zsconans/path with spaces/.conan2/p/b/pkg4e0fcf4088a89/p/sub1$<SEMICOLON>T:/tmp19ifl7zsconans/path with spaces/.conan2/p/b/pkg4e0fcf4088a89/p/sub2>
So this change seems it will not help you to get the iterate the different folders anyway?
The generator-expression is interpreted at build time when the target having the add_custom_command
is built.
In your example above, you are trying to interpret it at generation time, where the generation-expression is still not interpreted.
I made a sample CMakeLists.txt to illustrate all of this, and it actually brings a lot of light:
cmake_minimum_required(VERSION 3.27)
project(test LANGUAGES NONE)
message(STATUS "### Generating unescaped ###")
set(var_unescaped "$<$<BOOL:True>:/path/to/lib1;/path/to/lib2>;/path/to/lib3;$<$<BOOL:True>:/path/to/lib4>")
set(codegen_args_unescaped "")
foreach(var ${var_unescaped})
message(STATUS "Appending ${var}")
set(codegen_args_unescaped ${codegen_args_unescaped} "-I${var}")
endforeach()
message(STATUS "codegen_args_unescaped=${codegen_args_unescaped}")
file(GENERATE OUTPUT unescaped CONTENT "${codegen_args_unescaped}")
message(STATUS "##########################")
message(STATUS "### Generating escaped ###")
set(var_escaped "$<$<BOOL:True>:path/to/lib1$<SEMICOLON>path/to/lib2>;/path/to/lib3;$<$<BOOL:True>:/path/to/lib4>")
set(codegen_args_escaped "")
foreach(var ${var_escaped})
message(STATUS "Appending ${var}")
set(codegen_args_escaped ${codegen_args_escaped} "-I${var}")
endforeach()
message(STATUS "codegen_args_escaped=${codegen_args_escaped}")
file(GENERATE OUTPUT escaped CONTENT "${codegen_args_escaped}")
Running cmake .
:
-- ### Generating unescaped ###
-- Appending $<$<BOOL:True>:/path/to/lib1
-- Appending /path/to/lib2>
-- Appending /path/to/lib3
-- Appending $<$<BOOL:True>:/path/to/lib4>
-- codegen_args_unescaped=-I$<$<BOOL:True>:/path/to/lib1;-I/path/to/lib2>;-I/path/to/lib3;-I$<$<BOOL:True>:/path/to/lib4>
-- ##########################
-- ### Generating escaped ###
-- Appending $<$<BOOL:True>:path/to/lib1$<SEMICOLON>path/to/lib2>
-- Appending /path/to/lib3
-- Appending $<$<BOOL:True>:/path/to/lib4>
-- codegen_args_escaped=-I$<$<BOOL:True>:path/to/lib1$<SEMICOLON>path/to/lib2>;-I/path/to/lib3;-I$<$<BOOL:True>:/path/to/lib4>
-- Configuring done (0.0s)
-- Generating done (0.0s)
Content of the generated files:
unescaped
-I/path/to/lib1;-I/path/to/lib2;-I/path/to/lib3;-I/path/to/lib4
escaped
-Ipath/to/lib1;path/to/lib2;-I/path/to/lib3;-I/path/to/lib4
Actually, for my use case, I want the behavior of... unescaped!
However, the "unescaped" file in this example is as such because it has been dissembled with the foreach
and reassembled identically (basically $<$<BOOL:True>:/path/to/lib1;/path/to/lib2>
became -I$<$<BOOL:True>:/path/to/lib1;-I/path/to/lib2>
, and that's a special case.
In our custom CMake scripts, we are filtering input in the foreach
which then tempers the disassembly/reassembly…
On one side, doing a foreach
when the semicolon is not escaped yields weird results:
-- Appending $<$<BOOL:True>:/path/to/lib1
-- Appending /path/to/lib2>
-- Appending /path/to/lib3
-- Appending $<$<BOOL:True>:/path/to/lib4>
On the other side, escaping the semicolon binds together lib1 and lib2 (in this example) in a foreach
, which might not be the expected behavior as well.
-- Appending $<$<BOOL:True>:path/to/lib1$<SEMICOLON>path/to/lib2>
-- Appending /path/to/lib3
-- Appending $<$<BOOL:True>:/path/to/lib4>
The optimal behavior would be, then, I guess, a genex for each lib: $<$<BOOL:True>:/path/to/lib1>;$<$<BOOL:True>:/path/to/lib2>
giving the following output:
-- Appending $<$<BOOL:True>:path/to/lib1>
-- Appending $<$<BOOL:True>:/path/to/lib2>
-- Appending /path/to/lib3
-- Appending $<$<BOOL:True>:/path/to/lib4>
This is very interesting indeed!
Generating one genex for each library might be a bit challenging with the current implementation of CMakeDeps, as the generation of targets is decoupled from the generation of libraries lists. We could parse, split and reassemble things, but sounds it adds quite a bit of extra complexity.
I am trying to think a bit outside of the box, collecting the include-dirs from dependencies in Conan is very straightforward, what about something in this line:
def generate(self):
# This can be made a 2 lines, made it this way to be more clear
mycmake = []
for dep in self.dependencies.host.values():
includedirs = dep.cpp_info.includedirs
for inc in includedirs:
mycmake.append(...inc in the right format)
save(self, "mycmake.cmake", "\n".join(mycmake))
And directly generate the file that you want, with the exact format you want, and then you can leverage that and use it in your add_custom_command
.?
Generating one genex for each library might be a bit challenging with the current implementation of CMakeDeps, as the generation of targets is decoupled from the generation of libraries lists. We could parse, split and reassemble things, but sounds it adds quite a bit of extra complexity.
Indeed, though it feels like the correct answer. Well, the genex is introduced in the generation of targets in CMakeDeps, maybe the library list can be sanitized there (actually I would not expect this to happen in earlier steps, where it might not end up in a genex?).
Collecting the list of includedirs through the conanfile.py
would not really be suited for my use case.
Basically our current CMake function takes a CMake target as an input, collects its dependencies, assembles the include directories of the dependencies together, does some filtering and pass it to a code generator.
That'd be unfortunately out of scope of the conanfile.py
where CMake targets are still not born, and quite heavy work for a CMake function located in a dependency (that we fortunately control, but not in the consumer/downstream project, rather in a lib/upstream project).
I am also not blocked at the moment, so not in search of a short-term solution, as I can probably get rid of the filtering, hence falling back into the special case of disassembling and reassembling 1-to-1. :)
That'd be unfortunately out of scope of the conanfile.py where CMake targets are still not born, and quite heavy work for a CMake function located in a dependency (that we fortunately control, but not in the consumer/downstream project, rather in a lib/upstream project).
That generate()
only happens in the consumer, not necessary to modify the upstream, and it also contains the information about the targets that will be created in CMake, so it is easy to create a file that defines a list of includes per target name. Using that in your own CMakeLists would be just getting the value of a variable from a file that you can include()
.
But it is very likely that I don't have all the information and constraints, so not an issue, I was just trying to think if there could be other more convenient ways to approach this problem. I have had (and I have seen many users) having quite good success in moving complicated functionality belonging to the "generate" stage (gathering and processing information from dependencies), from CMakeLists.txt or other obscure build systems to the conanfile.py
, so I thought this could have been a possible approach.
I am also not blocked at the moment, so not in search of a short-term solution, as I can probably get rid of the filtering, hence falling back into the special case of disassembling and reassembling 1-to-1. :)
Ok, sounds good, happy to know that you are not blocked :)
Environment details
Steps to reproduce
conanfile.py (standard with "mariadb-connector-c/3.3.3" as a dependency)
CMakeLists.txt:
Run:
conan install .
conan build .
Logs
Running
conan build .
displays the following output (trimming the irrelevant warnings):The issue is the last line, showing the content of the generated INTERFACE_INCLUDE_DIRECTORIES of the
mariadb-connector-c::mariadb-connector-c
, generated by CMakeDeps. The semicolon in the middle can cause trouble when one loops over the variable. In a generator expression like this, it should be escaped/replaced with$<SEMICOLON>
(doc: https://cmake.org/cmake/help/latest/manual/cmake-generator-expressions.7.html#escaped-characters).Instead of generating:
generate