Open saraedum opened 2 years ago
This just pushes the problem out to the user though. I think it's fine if we assume ASCII paths and just convert the string to wstring by static_cast'ing each char before calling tiny process.
A patch that interprets string as UTF8 is below. It turned out that I am not building with UNICODE anymore in the end and I also don't have a Windows machine to test this out. So if somebody else runs into this problem, they might want to use that patch as a starting point.
From 64bdb0ce353c5ecca5e6eb741210391ec7e397be Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Julian=20R=C3=BCth?= <julian.rueth@fsfe.org>
Date: Sat, 17 Sep 2022 01:54:55 +0300
Subject: [PATCH] Fix compilation if tpl uses wstring for command lines
we just pretend that the input is UTF-8 encoded. If that's not the case
then there will likely be strange errors due to the wrong encoding.
---
src/libclang/libclang_parser.cpp | 7 ++++---
src/libclang/preprocessor.cpp | 14 ++++++++++++--
src/libclang/preprocessor.hpp | 11 +++++++++++
3 files changed, 27 insertions(+), 5 deletions(-)
diff --git a/src/libclang/libclang_parser.cpp b/src/libclang/libclang_parser.cpp
index b45290b..afa65a4 100644
--- a/src/libclang/libclang_parser.cpp
+++ b/src/libclang/libclang_parser.cpp
@@ -6,6 +6,7 @@
#include <cstring>
#include <fstream>
#include <vector>
+#include <string>
#include <clang-c/CXCompilationDatabase.h>
#include <process.hpp>
@@ -245,10 +246,10 @@ libclang_compile_config::libclang_compile_config(const libclang_compilation_data
namespace
{
+
bool is_valid_binary(const std::string& binary)
{
- tpl::Process process(
- binary + " -v", "", [](const char*, std::size_t) {}, [](const char*, std::size_t) {});
+ tpl::Process process(detail::widen<tpl::Process::string_type>(binary) + detail::widen<tpl::Process::string_type>(" -v"));
return process.get_exit_status() == 0;
}
@@ -262,7 +263,7 @@ void add_default_include_dirs(libclang_compile_config& config)
{
std::string verbose_output;
tpl::Process process(
- detail::libclang_compile_config_access::clang_binary(config) + " -x c++ -v -", "",
+ detail::widen<tpl::Process::string_type>(detail::libclang_compile_config_access::clang_binary(config)) + detail::widen<tpl::Process::string_type>(" -x c++ -v -"), tpl::Process::string_type(),
[](const char*, std::size_t) {},
[&](const char* str, std::size_t n) { verbose_output.append(str, n); }, true);
process.write("", 1);
diff --git a/src/libclang/preprocessor.cpp b/src/libclang/preprocessor.cpp
index bebfc34..99825f1 100644
--- a/src/libclang/preprocessor.cpp
+++ b/src/libclang/preprocessor.cpp
@@ -10,6 +10,8 @@
#include <cstring>
#include <fstream>
#include <unordered_map>
+#include <locale>
+#include <codecvt>
#include <process.hpp>
@@ -373,7 +375,7 @@ std::string write_macro_file(const libclang_compile_config& c, const std::string
auto cmd = get_macro_command(c, full_path.c_str());
tpl::Process process(
- cmd, "", [&](const char* str, std::size_t n) { stream.write(str, std::streamsize(n)); },
+ detail::widen<tpl::Process::string_type>(cmd), tpl::Process::string_type(), [&](const char* str, std::size_t n) { stream.write(str, std::streamsize(n)); },
diagnostic_logger);
if (auto include_guard = get_include_guard_macro(full_path))
@@ -433,7 +435,7 @@ clang_preprocess_result clang_preprocess_impl(const libclang_compile_config& c,
auto cmd = get_preprocess_command(c, full_path.c_str(), macro_path);
tpl::Process process(
- cmd, "",
+ detail::widen<tpl::Process::string_type>(cmd), tpl::Process::string_type(),
[&](const char* str, std::size_t n) {
for (auto ptr = str; ptr != str + n; ++ptr)
if (*ptr == '\t')
@@ -1250,3 +1252,11 @@ detail::preprocessor_output detail::preprocess(const libclang_compile_config& co
return result;
}
+
+template <typename S>
+typename std::enable_if<std::is_same<S, std::wstring>::value, S>::type detail::widen(const std::string& s)
+{
+ // see https://stackoverflow.com/a/18597384/812379
+ std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
+ return converter.from_bytes(s);
+}
diff --git a/src/libclang/preprocessor.hpp b/src/libclang/preprocessor.hpp
index 826ccda..860088a 100644
--- a/src/libclang/preprocessor.hpp
+++ b/src/libclang/preprocessor.hpp
@@ -48,6 +48,17 @@ namespace detail
preprocessor_output preprocess(const libclang_compile_config& config, const char* path,
const diagnostic_logger& logger);
+
+ // Convert the input to a string of type S.
+ // If S is a string, return the input unchanged.
+ // Otherwise, pretend that the input is UTF-8 and convert to wstring.
+ template <typename S>
+ typename std::enable_if<std::is_same<S, std::string>::value, S>::type widen(const std::string& s) {
+ return s;
+ }
+
+ template <typename S>
+ typename std::enable_if<std::is_same<S, std::wstring>::value, S>::type widen(const std::string& s);
} // namespace detail
} // namespace cppast
--
2.37.1
Compiling cppast fails on Windows with:
The problem here seems to be that cppast is calling into tiny-process-library with a
std::string
. However, when UNICODE is set on windows, tiny-process-library is expecting astd::wstring
, namely it defines:So, I don't know much about Windows but I guess this could essentially be fixed by using std::wstring to store the path to the clang binary on Windows. So, we probably need the same
#ifdef
to select which string type to use here.If this sounds like a good plan to you @foonathan, I am happy to produce a PR for this.