JuliaCI / julia-buildbot

Buildbot configuration for build.julialang.org
MIT License

Windows builders need to use opensuse toolchain, one way or another #20

Closed tkelman closed 9 years ago

tkelman commented 9 years ago

There was an ABI breakage in mingw-w64's libstdc++ on opensuse, so winrpm packages aren't compatible with cygwin-built Julia right now. Would this be much work to set up? Here's the build recipe. I'm testing it in a docker container of opensuse 13.1 right now, and I suspect a vagrant box would be equivalent:

# Change the following to i686-w64-mingw32 for 32 bit Julia:
export XC_HOST=x86_64-w64-mingw32
# Change the following to 32 for 32 bit Julia:
export BITS=64

zypper addrepo http://download.opensuse.org/repositories/windows:mingw:win$BITS/openSUSE_13.1/windows:mingw:win$BITS.repo
zypper --gpg-auto-import-keys refresh
zypper -n install --no-recommends git make cmake tar wine which curl python python-xml patch gcc-c++ m4 p7zip.i586 libxml2-tools
zypper -n install mingw$BITS-cross-gcc-c++ mingw$BITS-cross-gcc-fortran mingw$BITS-libstdc++6 mingw$BITS-libgfortran3 mingw$BITS-libssp0
# opensuse packages the mingw runtime dlls under sys-root/mingw/bin, not /usr/lib64/gcc
cp /usr/$XC_HOST/sys-root/mingw/bin/*.dll /usr/lib*/gcc/$XC_HOST/*/
git clone git://github.com/JuliaLang/julia.git julia
cd julia
make -j4 win-extras binary-dist
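
The recipe above asks for two variables to be changed together when switching between 32 and 64 bit. A minimal sketch of deriving one from the other, so only BITS needs editing; the function name xc_host_for_bits is hypothetical and not part of the Julia build system:

```shell
# Hypothetical helper: map BITS (32 or 64) to the mingw-w64 host triple
# used in the recipe above.
xc_host_for_bits() {
    case "$1" in
        64) echo "x86_64-w64-mingw32" ;;
        32) echo "i686-w64-mingw32" ;;
        *)  echo "unsupported BITS value: $1" >&2; return 1 ;;
    esac
}

# Default to a 64 bit build unless BITS is already set in the environment.
BITS=${BITS:-64}
XC_HOST=$(xc_host_for_bits "$BITS")
export BITS XC_HOST
```

With this in place, running the recipe as `BITS=32` would pick up the i686 triple without a second edit.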

tkelman commented 9 years ago

(don't trust the version in your inbox, I'm continuing to fiddle with it)

tkelman commented 9 years ago

@staticfloat you may not have gotten a notification on this. We'll need to adjust the Windows buildbots somehow before we tag the next release, unless we get lucky and a cygwin rebuild of the mingw-w64 compilers happens to restore ABI compatibility with opensuse/winrpm packages (I asked in https://cygwin.com/ml/cygwin/2015-06/msg00254.html, but it hasn't happened yet).

Neither Docker nor Vagrant fully works for the cross-compile with opensuse right now. I think wine might be especially buggy under container/virtualized environments, so relying on it might be pushing our luck. An alternative I was able to get working is switching from cygwin to MSYS2 and downloading the toolchain from opensuse. I have this in a branch at https://github.com/JuliaLang/julia/pull/11705 that I'll merge soon.

Here's a powershell provisioning script I was able to get to work inside a clean vagrant windows-guest VM:

# change the following to 32 for 32 bit Julia
$bits = "64"
# change the following to i686 for 32 bit Julia
$arch = "x86_64"
# change the date in the following for future msys2 releases
$msys2tarball = "msys2-base-$arch-20150512.tar"

# these environment variables need to be set, or Julia gets confused
[Environment]::SetEnvironmentVariable("HOMEDRIVE", "C:")
[Environment]::SetEnvironmentVariable("HOMEPATH", "\Users\vagrant")

# install chocolatey, cmake, and python2
iex ((new-object net.webclient).DownloadString("https://chocolatey.org/install.ps1"))
choco install -y cmake
choco install -y python2

# pacman is picky, reinstall msys2 from scratch
foreach ($dir in @("etc", "usr", "var")) {
  if (Test-Path "C:\msys$bits\$dir") {
    rm -Recurse -Force C:\msys$bits\$dir
  }
}
mkdir -Force C:\msys$bits | Out-Null
(new-object net.webclient).DownloadFile(
  "https://chocolatey.org/7za.exe",
  "C:\msys$bits\7za.exe")
(new-object net.webclient).DownloadFile(
  "http://sourceforge.net/projects/msys2/files/Base/$arch/$msys2tarball.xz",
  "C:\msys$bits\$msys2tarball.xz")
cd C:\
& "msys$bits\7za.exe" x -y msys$bits\$msys2tarball.xz
& "msys$bits\7za.exe" x -y $msys2tarball | Out-Null
rm $msys2tarball, msys$bits\$msys2tarball.xz, msys$bits\7za.exe

& "C:\msys$bits\usr\bin\sh" -lc "pacman --noconfirm --force --needed -Sy \
  bash pacman pacman-mirrors msys2-runtime"
& "C:\msys$bits\usr\bin\sh" -lc "pacman --noconfirm -Syu && \
  pacman --noconfirm -S diffutils git m4 make patch tar p7zip msys/openssh"
& "C:\msys$bits\usr\bin\sh" -lc "if ! [ -e julia$bits ]; then
  git clone git://github.com/JuliaLang/julia.git julia$bits; fi && cd julia$bits && git pull && \
  if ! [ -e usr/$arch-w64-mingw32 ]; then contrib/windows/get_toolchain.sh $bits; fi && \
  export PATH=`$PWD/usr/$arch-w64-mingw32/sys-root/mingw/bin:`$PATH:/c/tools/python2 && \
  echo 'override CMAKE=/c/Program\ Files\ \(x86\)/CMake/bin/cmake' > Make.user && \
  make cleanall && make -j2 test && make win-extras binary-dist"

Non-interactive installation of msys2 is a little messier than I would've liked, but oh well. Their installer exe was built using Qt and isn't easy to run in a scripted way. But at least they have a tarball backup option - we need to grab 7zip temporarily to extract a .tar.xz though (chocolatey ends up doing the same thing, so I'm borrowing their url too).

staticfloat commented 9 years ago

@tkelman I've been chipping away at better support for Windows buildbots recently as well. Are there any downsides to using msys2 over Cygwin? Should we try switching completely and see what happens? I just finished writing a complete Cygwin provisioning script in powershell, so it shouldn't be too much work to add an msys builder and see what happens.

tkelman commented 9 years ago

Wonderful. Please let me know how I can help, or where your provisioning scripts live.

MSYS2 vs Cygwin doesn't matter all that much. The biggest difference is the path mangling msys does to make calling native executables from the posix shell work better. In Cygwin you have to manually wrap arguments with cygpath -w, or use cross-compilers that are themselves Cygwin-hosted and MinGW-targeting, so the compiler executable understands the posix paths.
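
As a rough illustration of the manual wrapping Cygwin needs, here is a small sketch. The function name native_path is hypothetical; cygpath -w is the real Cygwin tool, and the fallback branch is only an assumption so the snippet also runs where cygpath is not installed:

```shell
# Hypothetical helper: convert a POSIX path to a native Windows path
# when cygpath is available, otherwise return the path unchanged.
native_path() {
    if command -v cygpath >/dev/null 2>&1; then
        cygpath -w "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Hypothetical usage: hand a directory to a native (non-Cygwin) cmake.
# cmake "$(native_path "$PWD")"
```

Under MSYS2 this wrapping is usually unnecessary, since the shell rewrites such arguments itself when it launches a native executable.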

The critical difference is in the toolchain. The previously-recommended toolchain that we had in README.windows.md for MSYS2 is not compatible with WinRPM packages, so it's a no-go for building binaries. The toolchain that you can get from pacman in MSYS2 (which is not what we had been recommending) uses the wrong exception-handling model for 32 bit. 64 bit might work, I haven't tried, but it's more convenient to use the same basic steps for both arches.

The new toolchain from opensuse is virtually guaranteed to be compatible with WinRPM packages as long as we keep it updated regularly, but the options are either cross-compile from an opensuse host or use native Windows-host versions of the compilers which don't understand posix paths. I did a little bit of work to try to make it possible to use the opensuse native toolchain from within Cygwin, but CMake was fighting me and I'm giving up on that for the time being.

staticfloat commented 9 years ago

I see. So if we wanted to build a nightly that would "just work" on 32-bit and 64-bit right now, what can we do? It sounds like the posix path problem stops us from being able to use the opensuse compilers?

tkelman commented 9 years ago

MSYS2 would fix the path problem. I think the immediate step forward is to proceed with using MSYS2 (with the opensuse toolchain, and keep it regularly updated) on a buildbot and hack at it till it works. Should've put that as a TL;DR

staticfloat commented 9 years ago

Cool, sounds good. I'm going to adapt your powershell above and try to get the new provisioning stuff uploaded somewhere soon. It's all ansible-centric, with a little bit of powershell mixed in. I was pretty unhappy with how brittle the packer/vagrant stuff was, and since the cloud services I'm using (e.g. MIT's OpenStack cluster) already come with base images for most of what we use (Ubuntu, Windows and Centos), it made a lot more sense to make an ansible role to tweak those images into what we want rather than try to make all those images from scratch.

tkelman commented 9 years ago

I trust your judgement on the VM image handling side. Will that be at https://github.com/staticfloat/ansible-julia_buildslave ?

If easy to do, a buildbot-triggerable job to run the "install/update packages, toolchain, etc" scripts would be useful.

tkelman commented 9 years ago

Oh. Right. One huge downside of MSYS2 that I just remembered is that they host their binaries on Sourceforge, with all the unreliability that implies. Maybe they'll move to bintray like homebrew did.

tkelman commented 9 years ago

And grr, msys2's python isn't working any more for building the docs, virtualenv complains. Getting an msvc-built python from chocolatey might work, I'm trying that now.

staticfloat commented 9 years ago

@tkelman How nicely do msys2 and cygwin play with each other? If I want to run an SSH server, can I install Cygwin's SSH server, but then use msys2's toolchain to compile?

tkelman commented 9 years ago

I believe so. I think that's how my vagrant box was working, it was going through chef which embeds cygwin for the purposes of ssh. Would the powershell provisioning script be able to avoid going through the cygwin ssh server via winrm or rdp or something like that? You'd need to make sure that the make commands are running inside an msys2 shell, which [/cygdrive]/c/msys2/usr/bin/sh -lc "make" would hopefully do.
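
A minimal sketch of what that wrapper could look like from the Cygwin side of an ssh session. The function name msys2_run and the MSYS2_SH override are hypothetical, and the default path is an assumption based on the C:\msys64 layout used in the provisioning script earlier in this thread:

```shell
# Hypothetical wrapper: run a command line inside an msys2 login shell.
# MSYS2_SH is an assumed override for the location of msys2's sh; the
# default assumes a 64 bit install under C:\msys64 as seen from Cygwin.
msys2_run() {
    "${MSYS2_SH:-/cygdrive/c/msys64/usr/bin/sh}" -lc "$*"
}

# Hypothetical usage from a Cygwin ssh session:
# msys2_run "cd julia64 && make -j2"
```

The -l flag matters here: it makes sh start as a login shell, so the msys2 profile sets up PATH before make and cmake are looked up.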

staticfloat commented 9 years ago

I don't have a really great way of remote controlling the windows boxes without SSH. I tried WinRM, but was unable to get anything to work properly. I figure, if it's just a powershell script I need to run once to install SSH and friends, that's not too much of a pain, since afterwards things can be done over SSH (including installing further packages via cygwin/pacman)

tkelman commented 9 years ago

Do whatever's easiest. I just want to make sure that the build runs inside an msys2 shell, otherwise cmake isn't going to cooperate. I hope cmake inside an msys2 shell inside a cygwin ssh connection would work, but there's one sure-fire way to find out.

tkelman commented 9 years ago

As a backup if it doesn't work, msys2 should have its own openssh package that could be usable to remove one layer.

staticfloat commented 9 years ago

I saw that, but there's a bunch of yuckiness to get the server to run like a service. I'm planning on attempting this tonight, we'll see how it goes. :)

staticfloat commented 9 years ago

Alright, I've gotten msys2 ssh to work. I may be running into msys2 python issues however, I've got an error that I really don't want to have to debug while trying to install twisted. :P

gcc -fno-strict-aliasing -march=i686 -mtune=generic -O2 -pipe -DNDEBUG -DNDEBUG -march=i686 -mtune=generic -O2 -pipe -DNDEBUG -I/usr/include/python2.7 -c conftest.c -o conftest.o
         16 [main] python 3840 child_info_fork::abort: address space needed by 'operator.dll' (0x30000) is already occupied
    error: [Errno 11] Resource temporarily unavailable

tkelman commented 9 years ago

I found over in https://github.com/JuliaLang/julia/issues/11705 that msys2's python didn't work properly in virtualenv, so I changed the recommendation to conventional Windows Python from the python.org installers. choco install -y python2 is a nice concise way of installing it.

The fork failure might have to do with needing a rebase (especially if this was the 32 bit build), I don't know whether pacman has really integrated that automatically yet the way cygwin's setup does.

staticfloat commented 9 years ago

Ah, I read something about that. I'll try it.

tkelman commented 9 years ago

FYI - there was a cygwin mingw-w64 package update, I'll be trying that to see if it works.

staticfloat commented 9 years ago

Yes, please do. I've been working on the msys angle, but little problems have kept on cropping up again and again. I never thought I'd say this, but Cygwin is actually pretty convenient. :P

tkelman commented 9 years ago

Just got things built, fiddling with testing winrpm now. Either way it shouldn't hurt to do a package upgrade on the existing cygwin buildbots.

tkelman commented 9 years ago

Looks like the verdict is that using the latest cygwin copies of the gcc dll's, as current julia master will do, fails to load winrpm dll's. I suspect this is just due to opensuse using gcc 5.1 while cygwin uses gcc 4.9.2. If I reinstate https://github.com/JuliaLang/julia/commit/997c8d8f35f01d40908e1c49c10b150095027cdb (or the equivalent version in winrpm, but that would cause existing windows installs to start segfaulting on startup), distributing opensuse's gcc dll's in the cygwin-built binaries, then things appear to work for now.
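
A quick way to flag this kind of mismatch is to compare the gcc release series of the two toolchains. This is only a heuristic sketch, not an ABI check, and the function names gcc_series and same_gcc_series are hypothetical:

```shell
# Hypothetical helper: reduce a gcc version string such as "4.9.2"
# to its "major.minor" prefix.
gcc_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Exit 0 when two version strings share the same major.minor series.
same_gcc_series() {
    [ "$(gcc_series "$1")" = "$(gcc_series "$2")" ]
}

# Versions mentioned in the thread: opensuse at 5.1, cygwin at 4.9.2.
if ! same_gcc_series "5.1" "4.9.2"; then
    echo "gcc series differ; runtime dlls may not be interchangeable"
fi
```

On a real builder the first argument could come from something like `x86_64-w64-mingw32-gcc -dumpversion`. A matching series does not prove compatibility; it only rules out the obvious case suspected here.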

tkelman commented 9 years ago

I'll be on a plane to boston tomorrow, but if you get the cygwin builders' packages updated, feel free to revert https://github.com/JuliaLang/julia/commit/997c8d8f35f01d40908e1c49c10b150095027cdb, nuke out the old built binaries, and we'll see what happens.

tkelman commented 9 years ago

Oh, there will be a conflict there - in

++<<<<<<< HEAD
 +      "mingw32-libexpat1 mingw32-zlib1" && \
 +      cp usr/i686-w64-mingw32/sys-root/mingw/bin/*.dll .
++=======
+       "mingw32-libgfortran3 mingw32-libquadmath0 mingw32-libstdc++6 mingw32-libgcc_s_sjlj1 mingw32-libssp0 mingw32-libexpat1 mingw32-zlib1"
++>>>>>>> parent of 997c8d8... Partially revert 0a629d5b1dcc63b4798eeccba1c5a29e997061ed

you can replace the first line with the longer version, but make sure to add && \ at the end of the line, and keep the cp from the newer version. The same applies a few lines lower with x86_64 and mingw64. I'd do it myself but want to be sure the builders are using package version 4.9.2-2 first.

tkelman commented 9 years ago

Please let me know as soon as the cygwin builders are updated to use version 4.9.2-2 of the mingw-w64 packages. This is urgent enough that I want to tag 0.3.10 as soon as this is fixed, so people can install WinRPM packages again.

staticfloat commented 9 years ago

Updated to 4.9.2-3.
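
A sketch of how a provisioning script might confirm that an installed package version meets the 4.9.2-2 minimum asked for above. The function name version_ge is hypothetical, and the sketch assumes GNU sort with -V (version sort) is available, as it is in Cygwin's coreutils:

```shell
# Hypothetical helper: exit 0 when version $1 is at least version $2.
# Relies on GNU sort -V to order the two strings as versions.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Version reported in the thread; on a real builder this value would be
# read from the package manager instead of being hard-coded.
installed="4.9.2-3"
if version_ge "$installed" "4.9.2-2"; then
    echo "mingw-w64 packages are new enough"
fi
```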

tkelman commented 9 years ago

Thanks! (There's a -3 already? checks email for cygwin mailing list - oh)

Apparently nuking the builder broke the ca certs somehow?

staticfloat commented 9 years ago

Yeah, I think something went sideways during the cygwin update. I reinstalled the ca-certs package and they seem to be happier now. Now I just have to figure out why every buildbot is failing on make install due to sys.ji missing....

tkelman commented 9 years ago

Just the windows buildbots, or all of them? Probably broken by https://github.com/JuliaLang/julia/pull/11640

staticfloat commented 9 years ago

All of them. And yes, that definitely looks like it, thanks! -E

On Tue, Jun 23, 2015 at 10:03 AM, Tony Kelman notifications@github.com wrote:

Just the windows buildbots, or all of them? Probably broken by JuliaLang/julia#11640 https://github.com/JuliaLang/julia/pull/11640
