protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev
Other
65.25k stars 15.45k forks source link

BUG: `CommandLineInterfaceTest` failing with "SEH exception" on windows with v25.2 #15436

Closed h-vetinari closed 8 months ago

h-vetinari commented 8 months ago

In conda-forge, all 151 tests in CommandLineInterfaceTest fail with the following error:

[ RUN      ] CommandLineInterfaceTest.BasicOutput
unknown file: error: SEH exception with code 0xc0000005 thrown in the test fixture's constructor.
Stack trace:

[  FAILED  ] CommandLineInterfaceTest.BasicOutput (1 ms)

Looking for "SEH exception ...", this seems to be coming from windows directly, and is possibly related due to having some kind of ABI mismatch (alignment; MD vs MT etc.).

Given that background, it's probably worth mentioning that we're building a shared version of libprotobuf also on windows - it's conceivable that something about the CLI is always built against a static runtime?

In any case, this setup used to work without issues up until v24.4.

More context in this CI run.

What version of protobuf and what language are you using? Version: v25.2 Language: C++

What operating system (Linux, Windows, ...) and version?

Windows (GH image windows-2022)

What runtime / compiler are you using (e.g., python version or gcc version)

VS2019

Build environment ``` bzip2: 1.0.8-hcfcfb64_5 conda-forge ca-certificates: 2023.11.17-h56e8100_0 conda-forge cmake: 3.28.1-hf0feee3_0 conda-forge krb5: 1.21.2-heb0366b_0 conda-forge libcurl: 8.5.0-hd5e4a3a_0 conda-forge libexpat: 2.5.0-h63175ca_1 conda-forge libssh2: 1.11.0-h7dfc565_0 conda-forge libuv: 1.44.2-hcfcfb64_1 conda-forge libzlib: 1.2.13-hcfcfb64_5 conda-forge ninja: 1.11.1-h91493d7_0 conda-forge openssl: 3.2.0-hcfcfb64_1 conda-forge ucrt: 10.0.22621.0-h57928b3_0 conda-forge vc: 14.3-hcf57466_18 conda-forge vc14_runtime: 14.38.33130-h82b7239_18 conda-forge vs2015_runtime: 14.38.33130-hcb4865c_18 conda-forge vs2019_win-64: 19.29.30139-he1865b1_18 conda-forge vswhere: 3.1.4-h57928b3_0 conda-forge xz: 5.2.6-h8d14728_0 conda-forge zstd: 1.5.5-h12be248_0 conda-forge ```
Host environment ``` gtest: 1.14.0-h91493d7_1 conda-forge jsoncpp: 1.9.5-h2d74725_1 conda-forge libabseil: 20230802.1-cxx17_h63175ca_0 conda-forge libabseil-tests: 20230802.1-cxx17_h63175ca_0 conda-forge libzlib: 1.2.13-hcfcfb64_5 conda-forge ucrt: 10.0.22621.0-h57928b3_0 conda-forge vc: 14.3-hcf57466_18 conda-forge vc14_runtime: 14.38.33130-h82b7239_18 conda-forge vs2015_runtime: 14.38.33130-hcb4865c_18 conda-forge zlib: 1.2.13-hcfcfb64_5 conda-forge ```

What did you do?

mkdir build
cd build
cmake -G "Ninja" ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_CXX_STANDARD=17 ^
    -DCMAKE_INSTALL_PREFIX=%LIBRARY_PREFIX% ^
    -DCMAKE_PREFIX_PATH=%LIBRARY_PREFIX% ^
    -Dprotobuf_ABSL_PROVIDER="package" ^
    -Dprotobuf_BUILD_SHARED_LIBS=ON ^
    -Dprotobuf_JSONCPP_PROVIDER="package" ^
    -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ^
    -Dprotobuf_USE_EXTERNAL_GTEST=ON ^
    -Dprotobuf_WITH_ZLIB=ON ^
    ..
if %ERRORLEVEL% neq 0 exit 1

cmake --build .
if %ERRORLEVEL% neq 0 exit 1

ctest --progress --output-on-failure
if %ERRORLEVEL% neq 0 exit 1

What did you expect to see

Passing test suite

What did you see instead?

See above

Anything else we should know about your project / environment

h-vetinari commented 8 months ago

Looking at the blame of the respective file (and the logs noting something about the fixture constructor), my first suspicion was that this is related to

https://github.com/protocolbuffers/protobuf/blob/a9b006bddd52e289029f16aa77b77e8e0033d9ee/src/google/protobuf/compiler/command_line_interface_unittest.cc#L198-L204

_putenv on windows requires <stdlib.h> (though that's probably satisfied by some transitive include), but more importantly, it's explicitly not threadsafe. The docs note:

The _putenv and _getenv families of functions are not thread-safe. _getenv could return a string pointer while _putenv is modifying the string, causing random failures. Make sure that calls to these functions are synchronized.

However, even with including that header and running the test suite in parallel, the failures persist. They also persist if we build the static variant of protobuf.

h-vetinari commented 8 months ago

Couple of other things I noticed:

h-vetinari commented 8 months ago

I guess it's also possible that something could be happening related to the following hunk from e897bcf3ff8004b7446bbe43a1fc28f9a92a11d4

@@ -242,9 +252,13 @@ class CommandLineInterfaceTest::NullCodeGenerator : public CodeGenerator {
 // ===================================================================

 CommandLineInterfaceTest::CommandLineInterfaceTest() {
+  // Reset the mock generator's test case environment variable.
+  SetMockGeneratorTestCase("");
+
   // Register generators.
-  RegisterGenerator("--test_out", "--test_opt",
-                    std::make_unique<MockCodeGenerator>("test_generator"),
+  auto mock_generator = std::make_unique<MockCodeGenerator>("test_generator");
+  mock_generator_ = mock_generator.get();
+  RegisterGenerator("--test_out", "--test_opt", std::move(mock_generator),
                     "Test output.");
   RegisterGenerator("-t", std::make_unique<MockCodeGenerator>("test_generator"),
                     "Test output.");

It looks strange to me that we're holding a persistent reference in mock_generator_ to something that's then moved (this is also not done consistently between --test_out and -t, which should otherwise be the same AFAICT).

h-vetinari commented 8 months ago

There's also a couple of leaks around mocked objects which happen in other tests; not sure if that's an unrelated issue or somehow connected:

D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\io\coded_stream_unittest.cc(1494): ERROR: this mock object (used in test CodedStreamTest.InputOver2G) should be deleted but never is. Its address is @000000B6EBEFE170.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\text_format_unittest.cc(2107): ERROR: this mock object (used in test TextFormatParserTest.PrintErrorsToStderr) should be deleted but never is. Its address is @000000B6EBEFE2A0.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(225): ERROR: this mock object (used in test MessageTest.ParseFailsIfSubmessageNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE4E0.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(206): ERROR: this mock object (used in test MessageTest.ParseFailsIfNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE520.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\io\coded_stream_unittest.cc(1365): ERROR: this mock object (used in test CodedStreamTest.TotalBytesLimit) should be deleted but never is. Its address is @000000B6EBEFE540.
D:\bld\libprotobuf-split_1705373215324\work\src\google/protobuf/message_unittest.inc(249): ERROR: this mock object (used in test MessageTest.ParseFailsIfExtensionNotInitialized) should be deleted but never is. Its address is @000000B6EBEFE570.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\descriptor_unittest.cc(6188): ERROR: this mock object (used in test ValidationErrorTest.ErrorsReportedToLogError) should be deleted but never is. Its address is @000000B6EBEFE5B0.
D:\bld\libprotobuf-split_1705373215324\work\src\google\protobuf\descriptor_unittest.cc(11238): ERROR: this mock object (used in test DatabaseBackedPoolTest.ErrorWithoutErrorCollector) should be deleted but never is. Its address is @000000B6EBEFE5D0.
ERROR: 8 leaked mock objects found at program exit. Expectations on a mock object are verified when the object is destructed. Leaking a mock means that its expectations aren't verified, which is usually a test bug. If you really intend to leak a mock, you can suppress this error using testing::Mock::AllowLeak(mock_object), or you may use a fake or stub instead of a mock.