niXman / mingw-builds

Scripts for building the 32 and 64-bit MinGW-W64 compilers for Windows

Building a toolchain based on UCRT #580

Closed chris-se closed 2 years ago

chris-se commented 2 years ago

I've recently been playing around with building a toolchain that uses Microsoft's "new" universal C runtime (UCRT) instead of the default toolchain that uses MSVCRT from VC6.

I ran into several things that didn't work out of the box, and other people have run into them as well.

Primary Concern

The "starting point toolchain" for other builds is currently a precompiled version of GCC 8 that was built against MSVCRT, and it generates code that needs to be linked against MSVCRT. This is an issue because using a different C runtime means we're effectively in a cross-compilation situation, but GCC's build system doesn't see it as such: there's no separate architecture specifier for this, and everything is just x86_64-w64-mingw32 on 64-bit.

This means that all of the pre-dependencies of GCC are compiled against MSVCRT instead of the desired UCRT. The pre-dependencies include binutils (including the linker, ld), but also the libraries that GCC itself uses (GMP, MPFR, MPC, ISL).

It is NOT possible to mix code compiled against MSVCRT with code compiled against UCRT except in very special circumstances. The following are especially problematic:

What IS possible is linking against DLLs that were created for a different C runtime, provided that the interface of the DLLs is only plain C (no C++ etc.) and no C runtime specific data-structures are passed through that interface.

This all leads to the following issues when attempting to simply pass --with-default-msvcrt=ucrt to the build script:

  1. The dependencies of GCC (GMP, ISL, etc.) are all compiled with a compiler that produces binaries to be linked with MSVCRT, but the second and third stages of GCC are built with the respective previous stage of GCC, which produces binaries to be linked with UCRT. Linking the two together fails with linker errors. See issue #578 for what happens (I had the same error message).

    There are two ways to work around this issue:

    a. Don't actually bootstrap GCC (--no-bootstrap), so that there's only ever one stage, compiled with the initial MSVCRT-using toolchain. GCC itself might then still require MSVCRT, but it will produce binaries that use UCRT.

    b. Compile GMP, ISL, etc. into DLLs (--dyn-deps) so that they are never statically linked into GCC.

  2. LTO doesn't work when binutils was built against MSVCRT, but GCC's additional stages were built against UCRT. I haven't tried that yet, but not bootstrapping GCC (--no-bootstrap) will probably work in this case. Instead I've disabled LTO for the GCC build to get a working compiler. (See below.)

  3. When Fortran is enabled, building GCC also fails, but I haven't diagnosed this further.

    Workaround: disable Fortran by selecting only C and C++.

Additional Issues

In addition to all of that, I've found an issue with ncurses: during 'make install' it tries to call the POSIX access() function with X_OK, which will always fail on Windows when using the version provided by Microsoft's own UCRT. MinGW provides an emulation that avoids this issue when -D__USE_MINGW_ACCESS is passed as a compiler option.

Unrelated to all of this was the fact that Expat decided to rename their tarballs for 2.4.1 due to a security issue - but upgrading to 2.4.7 was trivial enough.

Path to a Working Toolchain

With all of this I was able to generate a C/C++-only toolchain without LTO that itself may still have used MSVCRT in places, but would generate UCRT packages. Note that tests/time_test.c would fail at line 145 (assertion in 146), so the test suite wouldn't complete, but the compiler itself was usable.

However, this is far from perfect. This kind of toolchain has three big issues:

  1. No LTO
  2. No Fortran (hence no LAPACK!)
  3. The failure of tests/time_test.c worries me a bit

Additionally, GCC wasn't properly bootstrapped in this instance, and having this weird mix of MSVCRT and UCRT binaries in the toolchain didn't sit well with me.

For this reason I decided that the toolchain I created here should only be considered an intermediate step, and that I would run the build script again with it to create a proper UCRT-only toolchain.

I added an option to the build script to allow the user to specify a custom "initial" toolchain (instead of downloading one). I then built a new toolchain while using the intermediate one (that was able to generate UCRT binaries).

With that I did have success, and now have a toolchain that is linked against UCRT, generates binaries that are linked against UCRT, supports C, C++, Fortran, and LTO, and passes the mingw-builds test suite completely.
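
One way to check which C runtime a given binary actually imports is to look at its import table with binutils' objdump -p. The following is only a minimal sketch and not part of the build scripts: classify_crt is a hypothetical helper name, and it assumes that UCRT-linked binaries import api-ms-win-crt-*.dll or ucrtbase.dll while MSVCRT-linked binaries import msvcrt.dll.

```shell
#!/bin/sh
# Classify the C runtime a PE binary imports, based on `objdump -p` output
# read from stdin. Prints one of: ucrt, msvcrt, mixed, none.
# (classify_crt is a hypothetical helper, not part of mingw-builds.)
classify_crt() {
    # Keep only the import-table lines, lowercased for case-insensitive matching.
    dlls=$(grep -i 'DLL Name:' | tr '[:upper:]' '[:lower:]')
    has_ucrt=no
    has_msvcrt=no
    case "$dlls" in
        *api-ms-win-crt-*|*ucrtbase.dll*) has_ucrt=yes ;;
    esac
    case "$dlls" in
        *msvcrt.dll*) has_msvcrt=yes ;;
    esac
    if [ "$has_ucrt" = yes ] && [ "$has_msvcrt" = yes ]; then
        echo mixed
    elif [ "$has_ucrt" = yes ]; then
        echo ucrt
    elif [ "$has_msvcrt" = yes ]; then
        echo msvcrt
    else
        echo none
    fi
}
```

Usage would look like: objdump -p some.exe | classify_crt. This only inspects the direct imports of one file, so each DLL the binary depends on (for example the winpthread DLL) would have to be checked separately.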

Results, Pull Request

My workflow for now is:

  1. Build the intermediate toolchain (G:\mingw-build is where I store the output, G:\Sources\mingw-builds is my repository):

    ./build --mode=gcc-11.2.0 --arch=x86_64 --buildroot=/g/mingw-build/temp.ucrt64 --exceptions=seh --rt-version=v9 --threads=posix --enable-languages=c,c++ --with-default-msvcrt=ucrt --dyn-deps --no-gcc-lto  --no-bootstrap
    ./build --mode=gcc-11.2.0 --arch=i686 --no-multilib --buildroot=/g/mingw-build/temp.ucrt32 --exceptions=sjlj --rt-version=v9 --threads=posix --enable-languages=c,c++ --with-default-msvcrt=ucrt --dyn-deps --no-gcc-lto --no-bootstrap

    Both intermediate toolchains will fail in the tests, but realistically the build can already be canceled with Ctrl+C once binutils, gcc and make are done, because gdb (and all of its dependencies) isn't needed to build the next toolchain.

  2. Build the actual toolchains that utilize the previously built toolchains as the new starting point:

    ./build --mode=gcc-11.2.0 --arch=x86_64 --buildroot=/g/mingw-build/gcc-11.2.0-x86_64-ucrt --exceptions=seh --rt-version=v9 --threads=posix --enable-languages=c,c++,fortran --with-default-msvcrt=ucrt  --provided-toolchain=/g/mingw-build/temp.ucrt64/x86_64-1120-posix-seh-rt_v9

    (Analogous for 32bit, just don't forget --no-multilib when using sjlj exceptions.)
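
The two steps above can be sketched as a small dry-run helper that only prints the ./build invocations rather than running them. This is an illustration under assumptions: the function names build_cmd, stage1_cmd and stage2_cmd are hypothetical, and the path of the intermediate toolchain is passed in as an argument instead of being guessed, since only the x86_64 path appears above.

```shell
#!/bin/sh
# Print (do not run) the ./build command lines for the two-stage UCRT workflow.
# All flags are copied from the commands above; the helper names are hypothetical.

# build_cmd ARCH BUILDROOT LANGUAGES EXTRA_FLAGS
build_cmd() {
    arch=$1
    buildroot=$2
    langs=$3
    extra=$4
    case "$arch" in
        x86_64) arch_flags="--arch=x86_64"; exc=seh ;;
        # sjlj exceptions on 32-bit need --no-multilib
        i686)   arch_flags="--arch=i686 --no-multilib"; exc=sjlj ;;
        *) echo "unsupported arch: $arch" >&2; return 1 ;;
    esac
    echo "./build --mode=gcc-11.2.0 $arch_flags --buildroot=$buildroot --exceptions=$exc --rt-version=v9 --threads=posix --enable-languages=$langs --with-default-msvcrt=ucrt $extra"
}

# Stage 1: intermediate toolchain (C/C++ only, DLL deps, no LTO, no bootstrap).
stage1_cmd() {
    build_cmd "$1" "$2" "c,c++" "--dyn-deps --no-gcc-lto --no-bootstrap"
}

# Stage 2: final toolchain, using the stage 1 output ($3) as the starting point.
stage2_cmd() {
    build_cmd "$1" "$2" "c,c++,fortran" "--provided-toolchain=$3"
}
```

For example, stage1_cmd x86_64 /g/mingw-build/temp.ucrt64 prints the first command of step 1, and the printed line can be reviewed before being run by hand.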

I've created a pull request, #581, for the changes I made. (Edited from original to mention the pull request number.) As an overview, these are the changes required to get this to work:

Either one of these changes can of course be cherry-picked.

Outlook

In the long term I think the best option would probably be to use --bootstrapall. I haven't tried that yet, but from reading the source code of the build scripts it doesn't appear to do quite what would be required for this.

starg2 commented 2 years ago

I think the failure of tests/time_test.c is caused by mixing MSVCRT and UCRT.

clock_settime is provided by winpthread. When you build the intermediate toolchain, winpthread gets linked against MSVCRT. This means that when clock_settime sets errno, it sets the one in MSVCRT. Then time_test.exe checks UCRT's errno and causes assertion failure.

chris-se commented 2 years ago

That actually makes a lot of sense. (I didn't know clock_settime was provided by winpthread, I'm mostly a user of MinGW, not an expert on its internals.) But since I'm not reusing the semi-working toolchain, but only using it to create the real one (where all unit tests pass) I think this is probably an acceptable thing.

kov-serg commented 2 years ago

Is there any sense in using UCRT?

niXman commented 2 years ago

@chris-se could you please close this issue when the job (https://github.com/niXman/mingw-builds/actions/runs/2115280298) completes successfully?

niXman commented 2 years ago

Is there any sense in using UCRT?

I second this question. Could someone explain the point of using UCRT, or just provide a link to an explanation? Many thanks!

chris-se commented 2 years ago

MSVCRT, the default underlying C runtime used by MinGW, is from a very old Visual Studio version. And while it is now present by default in all recent Windows versions, it is basically legacy code.

UCRT is now an operating system component in Windows (similar to libc on POSIX), is more conforming to the official C standard, and is also used by all Visual Studio versions since 2015. (Note that UCRT is NOT the standard C++ library, that is still Visual Studio version dependent. And newer Visual Studio versions may by default still require a runtime in addition to UCRT.)

When using UCRT for MinGW it is now in principle possible to statically link code that was compiled with any Visual Studio version after 2015 with code generated by MinGW. This doesn't always work (especially with C++ involved), but in general it helps a ton with interoperability regarding code compiled with Visual Studio.

See also:

(I'll close this issue as soon as the job completes. Many thanks for merging. I'm still looking into doing this automatically via --bootstrapall, but that may take a while.)

niXman commented 2 years ago

@chris-se I understand, thanks for the clarification! Maybe it makes sense to use UCRT by default? What could be the problem in that case?

viccpp commented 2 years ago

The UCRT is now a Windows component, and ships as part of Windows 10 and later.

What about earlier versions? BTW, is WinXP still supported by MinGW?

kov-serg commented 2 years ago

Yes. MinGW supports Windows XP: 10.1, 11.2. But UCRT may not work on Windows 7 or 8, and may not even work on older Windows 10 versions. A typical situation looks like this: (image: ucrt)

Despite declaring the noble goal of switching the runtime from wide and multibyte to UTF-8, UCRT remains a tool for planned software obsolescence.

PS: Visual Studio can compile without any UCRT or MSVCRT dependencies, and the resulting code can run on Windows XP. But you have to change this manually; by default it will not. So even PHP now will not run on Windows XP. MSYS2 has dropped support for Windows 7, as have Python and many others.

chris-se commented 2 years ago

There are redistributables from Microsoft for UCRT at least for Windows XP and later, and if your software still wants to support Windows versions older than 10, you'll need to also package it together with the corresponding redistributable. This is explained in the Deployment and redistribution of the Universal CRT section of one of the aforementioned links I posted. I'm not sure that UCRT is updated automatically on Windows versions before 10 (like it is in 10 and 11). However, it may be the case that Microsoft will not continue to update the redistributables for Windows versions before 10, I don't know. I do know that with the redistributables I at least could run some small test programs on Windows 7. (Haven't put any real effort into that though.)

The error message provided by @kov-serg indicates that either UCRT as a whole or a specific component of it could not be found. Again, that problem would go away in combination with the redistributables if older versions of Windows should be supported by the software. Additionally, the options in Visual Studio to compile without any explicit DLL dependencies will result in the C/C++ runtimes (whichever ones are selected) being linked statically into the code. I don't believe that this is allowed license-wise when not using Visual Studio (but I haven't checked, to be fair).

@niXman Regarding making it default: I'm currently upgrading a relatively large codebase to use a newer compiler (switching to GCC 11), and at the same time I wanted to switch to UCRT for better interoperability with code compiled from Visual Studio. I haven't completed the transition yet, and I don't know if there are going to be any additional issues.

One issue I did stumble upon was the access() function in UCRT always denying X_OK (see the change for ncurses in the pull request -- I believe that MinGW always provides its own access() function when linking against MSVCRT, which is why this error didn't occur there), and there might be other issues that I don't yet know about. Combine that with the fact that code compiled with previous MinGW versions against MSVCRT will be incompatible with code compiled against UCRT (when statically linking), I think making UCRT the default is premature, because a lot more people will need to have tested this. I do believe that would probably be the right long-term move.

On the other hand, providing UCRT variant prebuilt toolchains in addition to the MSVCRT toolchains would be a very nice idea. For that to be sensibly automate-able via CI I don't believe the current way I'm doing this in two steps is the right way to go forward. I do still plan to take a look at getting --bootstrapall to work properly for this use case, and then we could probably add UCRT variants to the builds in addition to the MSVCRT variants.

kov-serg commented 2 years ago

There are redistributables from Microsoft for UCRT at least for Windows XP

There are not. Just point me to where I can get them. Or just try to find the file api-ms-win-core-path-l1-1-0.dll. There is no such file in Windows 10 or Windows 11, but the application works.

chris-se commented 2 years ago

You'll have to use the redistributables for Visual C++ 2015 or higher from Microsoft.

The current version from the official website includes support for Visual C++ 2015 through 2022, and is available in both 32bit and 64bit versions. (Links are direct .exe download links, taken from this website.) These don't work with Windows XP though (since support for XP was discontinued by Microsoft 8 years ago), but there are download links to older versions of the Visual C++ 2015 redistributables from Microsoft that should include UCRT components and do officially support Windows XP. I definitely know that the Visual C++ 2017 redistributables did include UCRT components, but I don't have a download link for these.

By the way, api-ms-win-core-path-l1-1-0.dll is not a real DLL, it's a virtual interface that will be mapped by Windows to the UCRT implementation, you'll never find a DLL with that name on any Windows system.

kov-serg commented 2 years ago

By the way, api-ms-win-core-path-l1-1-0.dll is not a real DLL, it's a virtual interface that will be mapped by Windows to the UCRT implementation, you'll never find a DLL with that name on any Windows system.

That's why the application will not start even with the redistributables installed. That's why people have to write workarounds.

PS: The image with the error is from Windows 7 with VS2019 installed.

niXman commented 2 years ago

@chris-se

Combine that with the fact that code compiled with previous MinGW versions against MSVCRT will be incompatible with code compiled against UCRT (when statically linking), I think making UCRT the default is premature, because a lot more people will need to have tested this.

Understood.

For that to be sensibly automate-able via CI I don't believe the current way I'm doing this in two steps is the right way to go forward. I do still plan to take a look at getting --bootstrapall to work properly for this use case, and then we could probably add UCRT variants to the builds in addition to the MSVCRT variants.

That would be great! Thank you for your contribution to the project!