skeeto / w64devkit

Portable C and C++ Development Kit for x64 (and x86) Windows
The Unlicense

build mingw-w64-crt with -march=i486? #165

Open FunkyFr3sh opened 2 weeks ago

FunkyFr3sh commented 2 weeks ago

Do we really need crt1.o/crt1u.o/crt2.o/crt2u.o/dllcrt1.o/dllcrt2.o to be built with the default -march=pentium4? There shouldn't be anything performance-critical in there, or am I wrong?

Building with -nostartfiles and -nostdlib works as a workaround, but it would be nice to be able to target -march=i486 without those switches as well.
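For readers unfamiliar with that workaround, here is a minimal sketch of what it can look like. It assumes a hypothetical file `app.c` and a hypothetical build line; neither comes from this thread. With `-nostartfiles` the CRT startup objects are not linked, so the program has to supply its own entry point, and with `-nostdlib` it can only call what it links explicitly (here `kernel32`).

```c
/* Hypothetical app.c. Assumed build line (not from the thread):
 *   gcc -march=i486 -nostartfiles -nostdlib -o app.exe app.c -lkernel32
 */
static const char msg[] = "built without the CRT\n";
enum { MSG_LEN = sizeof(msg) - 1 };  /* length without the terminating NUL */

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

/* No crt2.o is linked, so the entry point is provided here.
 * Assumption: mainCRTStartup is the name the GNU linker looks for
 * by default when building a console program. */
void mainCRTStartup(void)
{
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written = 0;
    BOOL ok = WriteFile(out, msg, (DWORD)MSG_LEN, &written, 0);
    /* No CRT means no exit(); terminate through the Win32 API. */
    ExitProcess(ok && written == (DWORD)MSG_LEN ? 0 : 1);
}
#endif
```

The cost of this route is that nothing from the C runtime is available: no `printf`, no `argv` parsing, no `atexit`. That is presumably why being able to keep the normal startup files while targeting i486 is attractive.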

skeeto commented 2 weeks ago

As far as I know, GCC has no fine-grained way to say, "build this part of the runtime with march=A and this other part with march=B". Even if it did, there's not enough control when building applications to say "use this part of the runtime but not this other part." Some runtime (libgcc, etc.) may be required in all builds, even -nostdlib, etc., which is why I provide libmemory and libchkstk. It's an all-or-nothing proposition.

As you've likely seen me write elsewhere, SSE2 performance improvements are often night and day, especially in libstdc++ and libquadmath (gfortran). Part of that is inherent to SSE2, and part is that GCC does a better job generating code when it has access to SSE2. Since SSE2 has been common in x86 hardware for the past 20 years, I feel it's worth the trade-off. Supporting Windows XP is already niche, and pre-SSE2 XP is a niche of a niche. (Even without SSE2, the runtime doesn't support anything earlier than XP.)

If you fall into this niche of a niche, lucky for you that w64dk is the most transparent compiler toolchain available for Windows! It's trivial to bootstrap a plain 486 toolchain from nothing more than Docker: remove the two "--with-arch=pentium4" hunks from src/variant-x86.patch. Or start from an x86 release (releases are recursive and can bootstrap a new compiler in place, given Docker) and remove those lines from Dockerfile.
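After rebuilding a toolchain with a different default architecture, it can be useful to confirm what a given build actually assumes. The sketch below is one possible check, not something taken from w64devkit. The function names `built_with_sse2` and `cpu_has_sse2` are hypothetical; the `__SSE2__` macro and the `__builtin_cpu_supports` builtin are GCC features on x86 targets.

```c
/* Returns 1 if this translation unit was compiled with SSE2 enabled
 * (e.g. the default -march=pentium4), 0 otherwise (e.g. -march=i486). */
static int built_with_sse2(void)
{
#if defined(__SSE2__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the running CPU reports SSE2, 0 if it does not,
 * and -1 if this sketch cannot tell on the current compiler/target. */
static int cpu_has_sse2(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();  /* harmless if already initialized */
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return -1;
#endif
}
```

Calling both from `main` and printing the results shows whether the compiler's default still implies SSE2 and whether the machine running the program could execute such code. Note that this only describes the translation unit it is compiled in; it says nothing about how the prebuilt runtime objects were compiled.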

FunkyFr3sh commented 2 weeks ago

Yeah, I was only referring to C code. I think crt1.o/crt1u.o/crt2.o/crt2u.o/dllcrt1.o/dllcrt2.o are the only objects linked in that case. I wouldn't want libgcc without SSE either; I know it's a huge improvement in performance.

I do have a custom build already with only crt1.o/crt1u.o/crt2.o/crt2u.o/dllcrt1.o/dllcrt2.o built with -march=i486, and everything else on the default (-march=pentium4). I thought I'd suggest it here since it's kinda useful in some cases and doesn't seem to have a real disadvantage.

Edit: Only this part is built with "-march=i486"; everything else stays on the default: https://github.com/skeeto/w64devkit/blob/master/Dockerfile#L236