Closed mxmlnkn closed 1 year ago
That's a bit of a nasty one. Do you know which version of libstdc++ is running? I could try and reproduce it by building a shared library that I dlopen.
I think if it is a static initialisation problem then the alternative is to either do what you said with a singleton or to wrap them up in some object that is only constructed when you construct an options parser.
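For illustration, the second alternative (an object that is only constructed together with the options parser) could look roughly like the sketch below. The names `Patterns` and `MiniParser` are made up for this example and are not part of cxxopts; only the two regex strings are taken from the library.

```cpp
#include <regex>
#include <string>

// Hypothetical holder type, not part of cxxopts.
// The regexes are compiled when a Patterns object is constructed,
// not when the shared library is loaded.
struct Patterns
{
  std::regex truthy{"(t|T)(rue)?|1"};
  std::regex falsy{"(f|F)(alse)?|0"};
};

// Hypothetical stand-in for an options parser that owns its patterns.
class MiniParser
{
public:
  bool is_true_text(const std::string& text) const
  {
    return std::regex_match(text, m_patterns.truthy);
  }

  bool is_false_text(const std::string& text) const
  {
    return std::regex_match(text, m_patterns.falsy);
  }

private:
  Patterns m_patterns;  // constructed together with the parser
};
```

With this layout, nothing regex-related runs during static initialization of the shared library. The trade-off is that every parser instance compiles its own copies of the patterns.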
Do you know which version of libstdc++ is running?
Based on the backtrace, it looks like libstdc++ is not linked statically into the indexed_bzip2 shared library because /usr/lib/x86_64-linux-gnu/libstdc++.so.6 is used. This backtrace was supposedly from inside an Ubuntu 22.04 Docker image. Based on that information and the .so path, which also exists on my local Ubuntu 22.04, it seems to be provided by the package libstdc++6:amd64 and links to libstdc++.so.6.0.30. I'd say that version 6.0.30 is being used.
I could try and reproduce it by building a shared library that I dlopen.
According to the user, a simple import via Python did not trigger the issue; it only happened deep inside some more complex software. Unfortunately, I do not know of a minimal reproducer yet. It might be something that happens because of multiple dlopens, maybe even from multiple threads?
I have updated my library to use v3.1.1 and looked into avoiding those static variables. I noticed that each regex variable is used inside exactly one function, so it isn't the major refactor I thought it would be: simply turning those global variables into static function-scope variables should circumvent the problem without any real downsides. The only downside I can think of is that they would no longer be listed right next to each other, but even that could be fixed by creating static constexpr string_view or const char* global variables holding the patterns and then initializing the static function-scope regexes with those patterns.
Here is my patch, which tries to be as minimally invasive as possible. It still contains trailing whitespace fixes because my editor removes those automatically.
diff --git a/include/cxxopts.hpp b/include/cxxopts.hpp
index b789a5c..aff0d44 100644
--- a/include/cxxopts.hpp
+++ b/include/cxxopts.hpp
@@ -55,8 +55,8 @@ THE SOFTWARE.
#define CXXOPTS_LINKONCE_CONST __declspec(selectany) extern
#define CXXOPTS_LINKONCE __declspec(selectany) extern
#else
-#define CXXOPTS_LINKONCE_CONST
-#define CXXOPTS_LINKONCE
+#define CXXOPTS_LINKONCE_CONST
+#define CXXOPTS_LINKONCE
#endif
#ifndef CXXOPTS_NO_REGEX
@@ -758,29 +758,31 @@ inline ArguDesc ParseArgument(const char *arg, bool &matched)
namespace {
CXXOPTS_LINKONCE
-std::basic_regex<char> integer_pattern
- ("(-)?(0x)?([0-9a-zA-Z]+)|((0x)?0)");
+const char* const integer_pattern =
+ "(-)?(0x)?([0-9a-zA-Z]+)|((0x)?0)";
CXXOPTS_LINKONCE
-std::basic_regex<char> truthy_pattern
- ("(t|T)(rue)?|1");
+const char* const truthy_pattern =
+ "(t|T)(rue)?|1";
CXXOPTS_LINKONCE
-std::basic_regex<char> falsy_pattern
- ("(f|F)(alse)?|0");
+const char* const falsy_pattern =
+ "(f|F)(alse)?|0";
CXXOPTS_LINKONCE
-std::basic_regex<char> option_matcher
- ("--([[:alnum:]][-_[:alnum:]\\.]+)(=(.*))?|-([[:alnum:]].*)");
+const char* const option_pattern =
+ "--([[:alnum:]][-_[:alnum:]\\.]+)(=(.*))?|-([[:alnum:]].*)";
CXXOPTS_LINKONCE
-std::basic_regex<char> option_specifier
- ("([[:alnum:]][-_[:alnum:]\\.]*)(,[ ]*[[:alnum:]][-_[:alnum:]]*)*");
+const char* const option_specifier_pattern =
+ "([[:alnum:]][-_[:alnum:]\\.]*)(,[ ]*[[:alnum:]][-_[:alnum:]]*)*";
CXXOPTS_LINKONCE
-std::basic_regex<char> option_specifier_separator(", *");
+const char* const option_specifier_separator_pattern = ", *";
} // namespace
inline IntegerDesc SplitInteger(const std::string &text)
{
+ static const std::basic_regex<char> integer_matcher(integer_pattern);
+
std::smatch match;
- std::regex_match(text, match, integer_pattern);
+ std::regex_match(text, match, integer_matcher);
if (match.length() == 0)
{
@@ -804,15 +806,17 @@ inline IntegerDesc SplitInteger(const std::string &text)
inline bool IsTrueText(const std::string &text)
{
+ static const std::basic_regex<char> truthy_matcher(truthy_pattern);
std::smatch result;
- std::regex_match(text, result, truthy_pattern);
+ std::regex_match(text, result, truthy_matcher);
return !result.empty();
}
inline bool IsFalseText(const std::string &text)
{
+ static const std::basic_regex<char> falsy_matcher(falsy_pattern);
std::smatch result;
- std::regex_match(text, result, falsy_pattern);
+ std::regex_match(text, result, falsy_matcher);
return !result.empty();
}
@@ -821,22 +825,25 @@ inline bool IsFalseText(const std::string &text)
// (without considering which or how many are single-character)
inline OptionNames split_option_names(const std::string &text)
{
- if (!std::regex_match(text.c_str(), option_specifier))
+ static const std::basic_regex<char> option_specifier_matcher(option_specifier_pattern);
+ if (!std::regex_match(text.c_str(), option_specifier_matcher))
{
throw_or_mimic<exceptions::invalid_option_format>(text);
}
OptionNames split_names;
+ static const std::basic_regex<char> option_specifier_separator_matcher(option_specifier_separator_pattern);
constexpr int use_non_matches { -1 };
auto token_iterator = std::sregex_token_iterator(
- text.begin(), text.end(), option_specifier_separator, use_non_matches);
+ text.begin(), text.end(), option_specifier_separator_matcher, use_non_matches);
std::copy(token_iterator, std::sregex_token_iterator(), std::back_inserter(split_names));
return split_names;
}
inline ArguDesc ParseArgument(const char *arg, bool &matched)
{
+ static const std::basic_regex<char> option_matcher(option_pattern);
std::match_results<const char*> result;
std::regex_match(arg, result, option_matcher);
matched = !result.empty();
@@ -1551,7 +1558,7 @@ class ParseResult
Iterator(const Iterator&) = default;
// GCC complains about m_iter not being initialised in the member
-// initializer list
+// initializer list
CXXOPTS_DIAGNOSTIC_PUSH
CXXOPTS_IGNORE_WARNING("-Weffc++")
Iterator(const ParseResult *pr, bool end=false)
Fixed with #406.
Hello,
I'm using cxxopts v2.2.1 for argument parsing in indexed_bzip2 and got a very weird user-reported issue.
This seems to be similar to the six-year-old issue #88. The solution there seems to be to ensure that something newer than GCC 4.8 is used. However, based on the line information in the backtrace, /opt/rh/gcc-toolset-11/root/usr/include/c++/11/, I'm pretty sure that GCC 11 was used to compile it. The shared library is built inside a manylinux2014 Docker container and then installed as a Python wheel on the user's system, which is an Ubuntu 22.04 Docker container. Do you have any idea what to do to fix this?
I doubt that updating cxxopts would help much because the regexes have been unchanged since v2.2.1.
Personally, I would have preferred those regexes to be initialized on first use, e.g., by implementing a singleton pattern instead of relying on static initialization. In my case, this would probably fix the issue for most users of indexed_bzip2 because it is probably mostly used as a library rather than through the command line interface. However, I see that it might be a major refactor. Would a pull request for something like this be welcome if there is no other solution?
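To make the first-use idea concrete, here is a minimal sketch of such an accessor. `integer_regex` and `matches_integer_pattern` are hypothetical names, not cxxopts API; the pattern string is the integer pattern from cxxopts.hpp. The sketch relies on C++11 guaranteeing that function-local statics are initialized exactly once, even with concurrent callers.

```cpp
#include <regex>
#include <string>

// Hypothetical accessor, not part of cxxopts.
// The regex is a function-local static, so it is compiled on the first call
// instead of during static initialization of the shared library.
inline const std::regex& integer_regex()
{
  static const std::regex pattern("(-)?(0x)?([0-9a-zA-Z]+)|((0x)?0)");
  return pattern;
}

// Hypothetical helper showing how a caller would use the accessor.
inline bool matches_integer_pattern(const std::string& text)
{
  return std::regex_match(text, integer_regex());
}
```

Code that only loads the library and never parses command line arguments would never construct the regex at all, which is the case I care about for library users.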