conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License
8.27k stars 981 forks source link

Conanfile's package_folder method uses single backslash on windows, not compatible with CMake #4932

Closed trondhe closed 11 months ago

trondhe commented 5 years ago

Not sure if its unintended or what actually is correct, but conanfile's self.package_folder uses single backslash on windows, which CMake does not enjoy. That means that any path I set that uses that variable needs to be converted to either use double blackslash or single forward slash.

There are probably software that would break no matter the solution, \, \\ or /. But arent the two latter ones in general better supported over the former, and should probably be used?

jgsogo commented 5 years ago

Hi, @trondhe!

How did you come across this issue? Is it in the Conan code or are you using that string in one of your recipes to pass some information to a CMake variable?

trondhe commented 5 years ago

I have two conan packages for packing a full toolchain, one contains the compiler itself and the other the toolchain files for cmake. As there is no invoking of the actual CMake executable at this stage, the way I propegate the toolchain file to CMake and then the path to the actual compiler for the toolchain file is via ENV variables. setting self.env_info.compilerPath = self.package_folder in my compiler package it will use the single backslash version. So for every path variable that my compiler needs in environment variables, needs to either replace all \ with \\ or /.

jgsogo commented 5 years ago

I think it makes sense to ensure that those paths provided by Conan as member attributes of the conanfile (source_folder, build_folder, package_folder and install_folder) contains only single forward slahes (valid for Win and Linux, and CMake). It will require some engineering work in the codebase.

Meanwhile, I can only recommend you to do the substitution yourself, probably in the assignment better than in the consumer: self.env_info.compilerPath = self.package_folder.replace("\\", "/")

Thanks for reporting it!

Johnnyxy commented 5 years ago

Hi there,

general

The path CMake gets should not be with one backslash (in memory). Please recheck the CMake variable in your CMakeLists.txt:

message (FATAL_ERROR "CMAKE_INSTALL_PREFIX = ${CMAKE_INSTALL_PREFIX}")

If it says something like "C:\.conan\..." everything is fine, as the output already has been "unescaped".

Python uses the C-String semantics regarding escape sequences. This means defining a string like this self.package_folder = "C:\\abc\\def\\ghi" is correct. This has nothing to do with CMake. To define a path one has to write forward slashes or escape them or use raw python strings self.package_folder = r"C:\abc\def\ghi".

If correctly defined in Python string escaping is not really an issue as CMake does use the same escaping rules as C/C++/Python (e.g. \\ -> \; \n -> newline; ...). So CMake reads the given string as Python does. And if CMake does use escaping or not cannot be influenced by Conan.

Details

CMake internally uses forward slashes (see). So providing paths with forward slashes would be perfectly valid. When CMake calls some API from the OS it automatically does the converting of the string if necessary.

I do not know if CMake does this on Windows as it is not necessary in general there as the Win32 API can work with forward slashes.

trondhe commented 5 years ago

Yeah, I am personally converting all my paths with a single forward slash. Those type of paths are exclusively used in the codebase for everything on my side. The workaround I'm using is

mypath = "\some\\path/to\a\\location"
mypath = mypath.replace(os.sep, '/')
print(mypath) #prints "/some/path/to/a/location"
jgsogo commented 5 years ago

I'm checking with a minimum example, and it works, so I think this not an issue to be solved in Conan but in the consumer (or in the recipe):

Populating some CMake variables

    def build(self):
        cmake = CMake(self)
        cmake.definitions["SOURCE_FOLDER"] = self.source_folder
        cmake.configure(source_folder="src")
        cmake.build()

and consuming them in a CMakeLists.txt:


message(">>>> ${SOURCE_FOLDER}")
if (EXISTS ${SOURCE_FOLDER})
     message("exists!")
endif()

CMake is able to interpret the path correctly.

trondhe commented 5 years ago

I think the point is that not all of CMake does support backslash. In my example its CMakeCXXCompiler.cmake file that does not enjoy my path D:\conandata\... and complains that \c is an invalid escape character.
Is this potentially a problem with CMake? I don't know. If their policy is not to guarantee full support of single backslash, it would not be.
This change is mainly to remove hassle and edge case problems that might occur of conan's part by using single forward as default, as issues with backslash seems to be an recurring issue for lesser used parts of tools.

jgsogo commented 5 years ago

Hi! I'm trying to reproduce a use case that fails. Can you tell me which variable are you assigning your path to? Or even better, can you share with us the relevant snippets to reproduce it?

Thanks!

trondhe commented 5 years ago

Disclaimer: This is partly from a cross compiling setup, but unnessecary details removed, so it might look at bit weird as a "specific" implementation, but should demonstrate the issue.

If in any toolchain file I set

set(CMAKE_CXX_COMPILER "c:\var\llvm\bin\clang-cl")

I will get an parse error on the "\v" from cmake telling me that it fails with Invalid character escape \v.

If I instead set an environment variable set MYCOMPILER=c:\var\llvm\bin and then run cmake with this in my cmake toolchain file

set(CMAKE_CXX_COMPILER  "$ENV{MYCOMPILER}/clang-cl")
set(CMAKE_LINKER            "$ENV{MYCOMPILER}/ld")
set(CMAKE_AR            "$ENV{MYCOMPILER}/ar")
set(CMAKE_NM            "$ENV{MYCOMPILER}/nm")
set(CMAKE_STRIP         "$ENV{MYCOMPILER}/strip")
set(CMAKE_RANLIB            "$ENV{MYCOMPILER}/ranlib")
set(CMAKE_OBJCOPY       "$ENV{MYCOMPILER}/objcopy")
set(CMAKE_OBJDUMP       "$ENV{MYCOMPILER}/objdump")

I instead get an error at the CMakeCXXCompiler.cmake file saying

CMake Error at build/CMakeFiles/3.14.1/CMakeCXXCompiler.cmake:21 (set):
  Syntax error in cmake code at

    D:/myproject/build/CMakeFiles/3.14.1/CMakeCXXCompiler.cmake:21

  when parsing string

    D:\vxworks_7.0/compilers/llvm-7.0.0.1/WIN64/bin/ar

  Invalid character escape '\v'

So it actually accepts the compiler set command, but failes on the other gcc compiler components. Anyhow, it seems that CMake is not very stable parsing paths with backslashes, and just standardizing everything around forward slashes seems to be a good idea.

Johnnyxy commented 5 years ago

Hi there,

firstly one should not specify a path (relative path too) in the CMake toolchain file if possible. The reason behind this is that CMake tries to find/execute the compiler and will error out on compiler/feature detection. So the common way is to specify just the command to execute (gcc, ar, ranlib, ...) or to specify a full path to the binary including the file extension (if using Windows). I had to learn that the hard way when using the "pure" GCC port for the QNX system. Additionally this compiler has no support for spaces in the paths at all. It is recommended to specify the compiler paths in the global search path (Windows: PATH environment variable), just execute (gcc, ar, ranlib, ...) let the system and CMake deal with the paths.

Secondly CMake is quite good in handling paths as it does not play around with the paths you provide unless you explicitly command it to. What is my preferred way as this gives the CMake coder control over the path strings.

In general: If you are working with paths in CMake then have a look at the file function of CMake (see here). As Windows supports forward and backward slashes one can always resolve the paths to "TO_CMAKE_PATH" what in turn converts a path to a unix-like one. Normally a conversion to a native path is not nessesscary (like for execute_process or such CMake functions).

common toolchain

If possible set the search path to the path of the compiler binaries before executing CMake:

set PATH=%MYCOMPILER%;%PATH%

Then use a generic toolchain file:

set(CMAKE_CXX_COMPILER "clang-cl")
set(CMAKE_LINKER       "ld")
set(CMAKE_AR           "ar")
set(CMAKE_NM           "nm")
set(CMAKE_STRIP        "strip")
set(CMAKE_RANLIB       "ranlib")
set(CMAKE_OBJCOPY      "objcopy")
set(CMAKE_OBJDUMP      "objdump")

With this approach there will not be any path problems.

specific toolchain

I would not recommend the following toolchain file but it could help you if you cannot modify safely the environment search path. If you have to set cached variables you can use the set and just set the variable to its own content with the CACHE parameter.

file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/clang-cl" CMAKE_CXX_COMPILER)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/ld" CMAKE_LINKER)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/ar" CMAKE_AR)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/nm" CMAKE_NM)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/strip" CMAKE_STRIP)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/ranlib" CMAKE_RANLIB)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/objcopy" CMAKE_OBJCOPY)
file (TO_CMAKE_PATH "$ENV{MYCOMPILER}/objdump" CMAKE_OBJDUMP)

conclusion

I would too recommend that Conan would use a sanitized path handling e.g. for the cpp_info object and such. But in the CMake code one can never rely on the fact that the paths are correctly formatted and should at least convert used paths to a CMake standardized path format. As the CMake code could be executed from command line seperately without Conan and then you would have the same problem with the paths.

trondhe commented 5 years ago

Keep in mind that what I posed was a "psuedo"-way of how I am doing it, its a bit more to it than that, but using the file function to make it more robust seems to be a good idea, thanks!

Only setting the executable name for the compiler, linker etc. seems very error prone to me. On the systems I am working on there is multiple sets of compilers for other use-cases, and CMake have generally problems actually not doing something weird when I am not explicit (works fine of course by just using CXX env variable for basic use cases). The problem is generally when it comes to VxWorks. As I am managing the toolchain files and compiler package with conan, the path needs to be absolute.

Johnnyxy commented 5 years ago

Error prone is the point of view here :). I myself do not like this too that something is not specifically set. But in this case it is the right way to do. Toolchain files are a general tool for CMake to describe a compile system. With the QCC compiler (QNX GCC variant) I had huge problems to get everything to run with set paths. And I have to provide toolchain files for (at the moment) three compilers, three operating systems, multiple architectures of them and to be able to compile some of them on Windows and Linux. This would be such a burden to provide a copy of those toolchain files just to set specific paths to the compilers. Additionally this would not work if the paths are different on another machine, like a CI system. Then I would have to maintain toolchain files for dev machines and CI servers etc.

One has to assume that the toolchain file could be used on another system (e.g. Windows and Linux) without any changes. Then a path set explicitly will lead to much redundandency with many toolchain files. For example building on Windows with plain Console, MinGW or Cygwin environment differs significantly. With setting paths you impose the work to maintain at least three toolchain files only for one specific compiler/compile-target-combination.

Generally if you provide the path to the binaries in the PATH variable always the first found result will be used. Even with CMake, it cannot do any magic only the binary name. As it does not know what the binary actually is (a compiler, a batch file, etc.). So it has to rely on system functions to find and execute it. This means on Windows you can check the results with the where command in the console. Just !prefix! the compiler path to the compiler binaries and the where command as well as CMake will find it there and stop searching for more.

corner case

If you have the following case than it would be more complicated where you have to use different binaries with similar names from multiple folders in one CMake invocation. This would not work without setting paths. Executing "def.exe" from the "path2" would be impossible here: set PATH=path1;path2;%PATH%

  1. path1
    • abc.exe
    • def.exe - problem
    • ghi.exe
  2. path2
    • def.exe - problem
    • klm.exe
    • nop.exe

But if one does not have such a case the solution should be to set the right order in the path variable and let the system handle the paths.

memsharded commented 11 months ago

Some time has passed since this ticket, that now seems outdated: Conan CMake integrations have changed to CMakeToolchain and CMakeDeps, so now there are better, more cmake-native alternatives (like using CMake toolchain files). Also the environment management is new. Closing this ticket as outdated, please don't hesitate to create new tickets as necessary for any further question or issue you might have, thanks!