standardese / cppast

Library to parse and work with the C++ AST
Other
1.7k stars 164 forks source link

Compilation fails on Windows when building with UNICODE #146

Open saraedum opened 2 years ago

saraedum commented 2 years ago

Compiling cppast fails on Windows with:

%SRC_DIR%\src\libclang\libclang_parser.cpp(250): error C2664: 'TinyProcessLib::Process::Process(const std::vector<TinyProcessLib::Process::string_type,std::allocator<TinyProcessLib::Process::string_type>> &,const TinyProcessLib::Process::string_type &,std::function<void (const char *,size_t)>,std::function<void (const char *,size_t)>,bool,const TinyProcessLib::Config &) noexcept': cannot convert argument 1 from 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>' to 'const std::vector<TinyProcessLib::Process::string_type,std::allocator<TinyProcessLib::Process::string_type>> &'
%SRC_DIR%\src\libclang\libclang_parser.cpp(251): note: Reason: cannot convert from 'std::basic_string<char,std::char_traits<char>,std::allocator<char>>' to 'const std::vector<TinyProcessLib::Process::string_type,std::allocator<TinyProcessLib::Process::string_type>>'
%SRC_DIR%\src\libclang\libclang_parser.cpp(251): note: No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called

The problem here seems to be that cppast is calling into tiny-process-library with a std::string. However, when UNICODE is set on windows, tiny-process-library is expecting a std::wstring, namely it defines:

#ifdef _WIN32
  typedef unsigned long id_type; // Process id type
  typedef void *fd_type;         // File descriptor type
#ifdef UNICODE
  typedef std::wstring string_type;
#else
  typedef std::string string_type;
#endif

So, I don't know much about Windows but I guess this could essentially be fixed by using std::wstring to store the path to the clang binary on Windows. So, we probably need the same #ifdef to select which string type to use here.

If this sounds like a good plan to you @foonathan, I am happy to produce a PR for this.

foonathan commented 2 years ago

This just pushes the problem out to the user though. I think it's fine if we assume ASCII paths and just convert the string to wstring by static_cast'ing each char before calling tiny process.

saraedum commented 2 years ago

A patch that interprets string as UTF8 is below. It turned out that I am not building with UNICODE anymore in the end and I also don't have a Windows machine to test this out. So if somebody else runs into this problem, they might want to use that patch as a starting point.

From 64bdb0ce353c5ecca5e6eb741210391ec7e397be Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Julian=20R=C3=BCth?= <julian.rueth@fsfe.org>
Date: Sat, 17 Sep 2022 01:54:55 +0300
Subject: [PATCH] Fix compilation if tpl uses wstring for command lines

we just pretend that the input is UTF-8 encoded. If that's not the case
then there will likely be strange errors due to the wrong encoding.
---
 src/libclang/libclang_parser.cpp |  7 ++++---
 src/libclang/preprocessor.cpp    | 14 ++++++++++++--
 src/libclang/preprocessor.hpp    | 11 +++++++++++
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/src/libclang/libclang_parser.cpp b/src/libclang/libclang_parser.cpp
index b45290b..afa65a4 100644
--- a/src/libclang/libclang_parser.cpp
+++ b/src/libclang/libclang_parser.cpp
@@ -6,6 +6,7 @@
 #include <cstring>
 #include <fstream>
 #include <vector>
+#include <string>

 #include <clang-c/CXCompilationDatabase.h>
 #include <process.hpp>
@@ -245,10 +246,10 @@ libclang_compile_config::libclang_compile_config(const libclang_compilation_data

 namespace
 {
+
 bool is_valid_binary(const std::string& binary)
 {
-    tpl::Process process(
-        binary + " -v", "", [](const char*, std::size_t) {}, [](const char*, std::size_t) {});
+    tpl::Process process(detail::widen<tpl::Process::string_type>(binary) + detail::widen<tpl::Process::string_type>(" -v"));
     return process.get_exit_status() == 0;
 }

@@ -262,7 +263,7 @@ void add_default_include_dirs(libclang_compile_config& config)
 {
     std::string  verbose_output;
     tpl::Process process(
-        detail::libclang_compile_config_access::clang_binary(config) + " -x c++ -v -", "",
+        detail::widen<tpl::Process::string_type>(detail::libclang_compile_config_access::clang_binary(config)) + detail::widen<tpl::Process::string_type>(" -x c++ -v -"), tpl::Process::string_type(),
         [](const char*, std::size_t) {},
         [&](const char* str, std::size_t n) { verbose_output.append(str, n); }, true);
     process.write("", 1);
diff --git a/src/libclang/preprocessor.cpp b/src/libclang/preprocessor.cpp
index bebfc34..99825f1 100644
--- a/src/libclang/preprocessor.cpp
+++ b/src/libclang/preprocessor.cpp
@@ -10,6 +10,8 @@
 #include <cstring>
 #include <fstream>
 #include <unordered_map>
+#include <locale>
+#include <codecvt>

 #include <process.hpp>

@@ -373,7 +375,7 @@ std::string write_macro_file(const libclang_compile_config& c, const std::string

     auto         cmd = get_macro_command(c, full_path.c_str());
     tpl::Process process(
-        cmd, "", [&](const char* str, std::size_t n) { stream.write(str, std::streamsize(n)); },
+        detail::widen<tpl::Process::string_type>(cmd), tpl::Process::string_type(), [&](const char* str, std::size_t n) { stream.write(str, std::streamsize(n)); },
         diagnostic_logger);

     if (auto include_guard = get_include_guard_macro(full_path))
@@ -433,7 +435,7 @@ clang_preprocess_result clang_preprocess_impl(const libclang_compile_config& c,

     auto         cmd = get_preprocess_command(c, full_path.c_str(), macro_path);
     tpl::Process process(
-        cmd, "",
+        detail::widen<tpl::Process::string_type>(cmd), tpl::Process::string_type(),
         [&](const char* str, std::size_t n) {
             for (auto ptr = str; ptr != str + n; ++ptr)
                 if (*ptr == '\t')
@@ -1250,3 +1252,11 @@ detail::preprocessor_output detail::preprocess(const libclang_compile_config& co

     return result;
 }
+
+template <typename S>
+typename std::enable_if<std::is_same<S, std::wstring>::value, S>::type detail::widen(const std::string& s)
+{
+  // see https://stackoverflow.com/a/18597384/812379
+  std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
+  return converter.from_bytes(s);
+}
diff --git a/src/libclang/preprocessor.hpp b/src/libclang/preprocessor.hpp
index 826ccda..860088a 100644
--- a/src/libclang/preprocessor.hpp
+++ b/src/libclang/preprocessor.hpp
@@ -48,6 +48,17 @@ namespace detail

     preprocessor_output preprocess(const libclang_compile_config& config, const char* path,
                                    const diagnostic_logger& logger);
+
+    // Convert the input to a string of type S.
+    // If S is a string, return the input unchanged.
+    // Otherwise, pretend that the input is UTF-8 and convert to wstring.
+    template <typename S>
+    typename std::enable_if<std::is_same<S, std::string>::value, S>::type widen(const std::string& s) {
+      return s;
+    }
+
+    template <typename S>
+    typename std::enable_if<std::is_same<S, std::wstring>::value, S>::type widen(const std::string& s);
 } // namespace detail
 } // namespace cppast

-- 
2.37.1