stormdragon2976 closed this issue 3 years ago.
Next steps on my part: linux/arm64. Then, the digests for all Docker images and the multi-arch manifest are publicly available in the GitLab CI/CD Pipeline build logs.
Sorry: I am pretty ignorant about Docker.
I tried using just the tag originally, and it gave me the image for the wrong arch (amd64 instead of arm). Probably I'm doing something wrong: do I need to tell it explicitly which arch to use in docker run, and if so how?
[EDIT: It seems to be working now. I was probably doing something else wrong before!]
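For the record, Docker does let you request a specific architecture explicitly. A minimal sketch, using the repository from this thread (the tag here is illustrative):

```shell
# Force the arm64 variant of a multi-arch image. Without --platform,
# Docker picks the variant matching the host architecture.
docker pull --platform linux/arm64 registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.4

# Verify which architecture was actually pulled:
docker image inspect --format '{{.Architecture}}' \
    registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.4
```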
@benz0li I've built multi-arch Docker images before. Unless the setup has changed, they run on an x86_64 server and build the other architectures using a static QEMU layer to emulate them. The upside is that you can build a multi-arch image for anything QEMU can emulate - arm64, PowerPC, IBM Z series, etc. So they'll work for anything where your base image - Ubuntu or Debian - has the binary toolchain to build pandoc.
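The QEMU-based setup described above is, roughly, what docker buildx automates today. A sketch under the assumption of a current Docker with buildx (the registry name is a placeholder):

```shell
# One-time: register QEMU binfmt handlers so an x86_64 host can execute
# foreign-arch binaries during the build.
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Create a multi-platform builder and build for both architectures.
docker buildx create --name multiarch --use
docker buildx build --platform linux/amd64,linux/arm64 \
    -t registry.example.com/pandoc:latest --push .
```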
I vaguely recall trying it (for just arm64) on my workstation last fall when I got my new one. And I got it to run on Travis CI but it couldn't complete a few steps within their 50 minute time limit.
@jgm My fault! The manifest for tag 8.10.4 got overwritten while creating tag 9.0.1 for os/arch linux/amd64. This is fixed with commit 26c291eb.
No, you don't have to tell Docker which arch to use. If the manifest contains images for multiple os/arch combinations, Docker pulls the one matching the arch it is running on.
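To see which os/arch entries a manifest actually offers before pulling, something like this should work (the tag is illustrative):

```shell
# List the os/arch entries of a multi-arch manifest without pulling it.
docker manifest inspect registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.4 \
    | grep -E '"(architecture|os)"'
```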
I ran out of memory on an 8 GB instance -- not even on one of the memory-hungry modules.
I'm a good way through a build now on a 16 GB t4g.xlarge AWS instance. top is showing a bit less than 6 GB free, 3-4 GB used, and 6-7 GB buffer/cache. On amd64, I can build on 4 GB, and even on 2 GB with some restarts. Is there an explanation of why we need so much more RAM to compile on ARM? (The 16 GB build machine costs 8 cents/hour more than the 8 GB one -- not really a big deal if the build can complete in an hour or two -- but I'm still surprised by the memory needs.)
@jgm I've never tried it on 8 GB - just 4, which fails, and 16, which works. My recollection is that the crashes are in the back end, and that I tried both the gcc and LLVM back ends. I don't remember if I ever got the native GHC back end to work or not, or which version of GHC I used - everything works in 16 GB with the GHC 8.0 that ships in Ubuntu 18.04 so I stopped experimenting.
I can't explain the enormous memory needs while building pandoc on ARM. Tried on a t4g.xlarge (16 GB RAM) myself... and failed. Just started a new build with a 16 GB swapfile in addition.
I used some heavy machinery, a t4g.2xlarge (8 vCPUs, 32 GB RAM + 16 GB swapfile), while building pandoc-2.11.4-1-arm64.deb, as GHC Docker images were built in parallel.
Same experience here on t4g.xlarge. Trying now with t4g.2xlarge! It may mean that creating the release package on arm costs $1-2. We can afford that. But it's strange, the difference between ARM and non-ARM. (The time it takes to compile is also much longer.)
Interesting - my AGX Xavier has 16 GB of RAM, Ubuntu 18.04 LTS for arm64, NVIDIA Docker runtime. The stock apt binaries are GHC 8.0 "unregisterised" - it doesn't have the native back end - and cabal-install 1.24. LLVM is 3.7.
All I have to do to build Pandoc is upgrade cabal-install to 3.0 and do a cabal install pandoc. It takes overnight to do all that, even with the eight cores, but it does work. IIRC it works with either the GCC back end or the LLVM back end. And I can bootstrap up to GHC 8.2, or upgrade cabal-install and bootstrap up to GHC 8.4, etc. I haven't been able to get stack to work, but I don't need it.
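The upgrade path described above might look roughly like this on Ubuntu 18.04 arm64 (versions are taken from the comment; the exact bootstrap commands are an assumption, not the author's script):

```shell
# Rough sketch of the build path described above on Ubuntu 18.04 arm64.
# GHC 8.0 / cabal-install 1.24 come from the stock apt repositories.
sudo apt-get install -y ghc cabal-install

# Bootstrap a newer cabal-install with the old one, then build pandoc.
cabal update
cabal install cabal-install          # fetch a newer (3.x) release
export PATH="$HOME/.cabal/bin:$PATH"
cabal install pandoc                 # takes overnight, even on 8 cores
```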
My recollection from watching these things go by with top is that the memory hogs are the back ends, especially the gcc one, not GHC itself. If you've got connections in the GHC team and want me to capture any trace/debug info on this on the AGX Xavier, have them open an issue on https://github.com/edgyR/edgyR-containers/issues.
One other possibility for Ubuntu: there is a PPA on Launchpad with Haskell backports including Pandoc: https://launchpad.net/~savoury1/+archive/ubuntu/haskell-build/. It does not appear to be set up to build for arm64, but that may be changeable. That would use Ubuntu's build farm, I think.
I've been able to build with the 2xl instance. I'm adding a script to automate this process, and I'll try to include an arm64 binary in the next release. (We can then get statistics to see how much use it gets.)
I have been able to build it with a t4g.xlarge AWS EC2 instance:
registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.1
Build time: 5 hours.
The issue with the GitLab CI/CD Pipeline is resolved, too. I am going to rebuild all images and multi-arch manifests for registry.gitlab.b-data.ch/ghc/ghc4pandoc tomorrow.
I will continue to maintain these images and keep them publicly available. @jgm Let me know if you need a version in-between 8.10.1 and 8.10.4.
Hmm ... maybe I should move my Ubuntu build chain to GitLab and do my R and RStudio builds there on top of your Pandoc build. That would check off a huge box for me; I could go with R from the Ubuntu source repos instead of from upstream. I only need the Jetson base / NVIDIA tools for the last step in my processing.
@jgm pandoc-2.12-linux-arm64.tar.gz seems to be broken:
$ tar -xzf pandoc-2.12-linux-arm64.tar.gz
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Oh, weird! I've manually unpacked the deb and created a tarball from it, replacing the one on releases. @benz0li can you try again?
I think I can guess what happened. My build script waits until the .tar.gz appears in the artifacts, and then downloads it. Perhaps it downloaded the file while it was still incomplete? In any case, the build script needs some improvement.
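One way to harden such a script is to verify the gzip stream is complete before accepting the download; gzip -t reads the whole stream and fails on a truncated file. A sketch with a hypothetical helper (file names are illustrative):

```shell
# Reject a partially-downloaded release tarball: both the gzip stream
# and the tar structure must be readable end to end.
check_tarball() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# Demo: a complete archive passes, a truncated copy fails.
mkdir -p demo && echo hello > demo/file.txt
tar -czf good.tar.gz demo
head -c 20 good.tar.gz > truncated.tar.gz

check_tarball good.tar.gz && echo "good: ok"
check_tarball truncated.tar.gz || echo "truncated: rejected"
```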
Unpacking works fine now.
But I'm not sure if this is the intended structure and content for the arm64 release:
$ tree pandoc-2.12-arm64
pandoc-2.12-arm64
├── bin
│ └── pandoc
└── tmp
└── usr
├── bin
│ └── pandoc
└── share
├── doc
│ └── pandoc
│ └── copyright
└── man
└── man1
└── pandoc.1.gz
9 directories, 4 files
Compared to the amd64 release:
$ tree pandoc-2.12-amd64
pandoc-2.12-amd64
├── bin
│ └── pandoc
└── share
└── man
└── man1
└── pandoc.1.gz
4 directories, 2 files
No, that's not right. Let me fix it manually; I don't want to wait another 3 hours for a build to complete!
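The manual fix amounts to copying the files out of the extracted .deb payload into the same layout as the amd64 tarball. A sketch under the assumption that the payload sits under ./tmp/usr as in the broken tarball (paths and the helper name are illustrative, not jgm's actual commands):

```shell
# Rebuild the release tarball with the amd64 layout:
#   pandoc-2.12-arm64/bin/pandoc
#   pandoc-2.12-arm64/share/man/man1/pandoc.1.gz
fix_layout() {
    src="$1"; dest="$2"
    mkdir -p "$dest/bin" "$dest/share/man/man1"
    cp "$src/usr/bin/pandoc" "$dest/bin/pandoc"
    cp "$src/usr/share/man/man1/pandoc.1.gz" "$dest/share/man/man1/"
    tar -czf "$dest.tar.gz" "$dest"
}

# Demo with a mocked-up extraction tree standing in for the .deb payload:
mkdir -p tmp/usr/bin tmp/usr/share/man/man1
echo fake-binary > tmp/usr/bin/pandoc
echo fake-man | gzip > tmp/usr/share/man/man1/pandoc.1.gz
fix_layout tmp pandoc-2.12-arm64
tar -tzf pandoc-2.12-arm64.tar.gz
```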
@benz0li Good news. I tried tweaking a few settings and now the build takes one hour 15 minutes. What I did was increase heap size for the compiler and (more significantly) told it to use 4 cores instead of 1. (With this build machine we could probably go even higher.)
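For reference, the two tweaks could be expressed as cabal flags along these lines (the exact values are assumptions, not the flags actually used):

```shell
# Illustrative, not the actual invocation: build dependencies with 4
# parallel jobs, and give GHC a larger allocation area via RTS options
# so the compiler spends less time in the garbage collector.
cabal install pandoc -j4 \
    --ghc-options="+RTS -A256m -RTS"
```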
My experiments didn't gain much from multiple cores. A bunch of small-ish libraries get built at the beginning, but once they're done, it's single-threaded for the large ones.
@jgm More good news.
A linux/arm64
-build on a Mac mini (Apple M1 chip, 16 GB memory) using the tech preview of Docker Desktop for Apple M1 takes just under an hour:
Docker settings used:
ℹ️ Memory increased from 4 GB to 16 GB; Swap increased from 1 GB to 4 GB; CPUs left at 4 👉 the Mac mini M1 has 4 performance cores.
Bearing in mind that Docker Desktop for Apple M1 runs in a virtual machine, which uses the new Virtualization Framework, the performance is outstanding.
Nice, so this produces a native M1 build? (See #6960)
No. Docker Desktop for Apple M1 is effectively linux/arm64:
Virtualization
Create virtual machines and run Linux-based operating systems.
Overview
The Virtualization framework provides high-level APIs for creating and managing virtual machines on Apple silicon and Intel-based Mac computers. Use this framework to boot and run a Linux-based operating system in a custom environment that you define. The framework supports the Virtio specification, which defines standard interfaces for many device types, including network, socket, serial port, storage, entropy, and memory-balloon devices.
@benz0li Is the Apple operating system shim as light-weight as the Windows version of Docker Desktop, which uses WSL 2?
Aarch64 and Armv7h machines are becoming more and more commonplace nowadays. There are Windows, Mac, and Linux binaries, but the Linux binaries do not cover these architectures. Please add binary releases for them.