jgm / pandoc

Universal markup converter
https://pandoc.org

Can Binary Releases be Done for aarch64 and armv7h #5450

Closed: stormdragon2976 closed 3 years ago

stormdragon2976 commented 5 years ago

Aarch64 and Armv7h machines are becoming more and more commonplace nowadays. There are Windows, Mac, and Linux binaries, but the Linux binaries do not cover these architectures. Please add binary releases for them.

benz0li commented 3 years ago

Next steps on my part:

  1. Figure out the issue with gitlab/gitlab-runner for linux/arm64.
  2. Rebuild all Docker images and recreate all multi-arch manifests.

Then the digests for all Docker images and multi-arch manifests will be publicly available in the GitLab CI/CD Pipeline build logs.

jgm commented 3 years ago

Sorry: I am pretty ignorant about Docker. I tried using just the tag originally, and it gave me the image for the wrong arch (amd64 instead of arm). I'm probably doing something wrong: do I need to tell docker run explicitly which arch to use, and if so, how? [EDIT: It seems to be working now. I was probably doing something else wrong before!]

znmeb commented 3 years ago

@benz0li I've built multi-arch Docker images before. Unless the setup has changed, they run on an x86_64 server and build the other architectures using a static QEMU layer to emulate them. The upside is that you can build a multi-arch image for anything QEMU can emulate - arm64, PowerPC, IBM Z series, etc. So they'll work for anything where your base image - Ubuntu or Debian - has the binary toolchain to build pandoc.
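For reference, with modern Docker this QEMU-based setup comes down to a few buildx commands (a sketch; the image name and Dockerfile are hypothetical, and exact flags depend on the Docker version):

$ # register QEMU handlers for foreign architectures
$ docker run --privileged --rm tonistiigi/binfmt --install arm64,arm
$ # create and select a multi-arch builder
$ docker buildx create --use --name multiarch
$ # build for several platforms and push one multi-arch manifest
$ docker buildx build --platform linux/amd64,linux/arm64,linux/arm/v7 \
    -t example/pandoc-build:latest --push .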

I vaguely recall trying it (for just arm64) on my workstation last fall when I got my new one. I got it to run on Travis CI, but it couldn't complete a few steps within their 50-minute time limit.

benz0li commented 3 years ago

@jgm My fault! The manifest for tag 8.10.4 got overwritten while creating tag 9.0.1 for os/arch linux/amd64. This is fixed with commit 26c291eb.

No, you don't have to specify which arch to use. If the manifest contains images for multiple os/arch combinations, Docker pulls the image for the arch it is running on.
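If you want to verify, you can inspect the manifest and, if necessary, force a platform explicitly (tag shown for illustration; docker manifest and docker run --platform may require a reasonably recent Docker, and the second command assumes ghc is on the image's PATH):

$ docker manifest inspect registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.4
$ docker run --rm --platform linux/arm64 registry.gitlab.b-data.ch/ghc/ghc4pandoc:8.10.4 ghc --version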

jgm commented 3 years ago

I ran out of memory on an 8 GB instance -- not even on one of the memory-hungry modules.

I'm a good way through a build now on a 16 GB t4g.xlarge AWS instance. top is showing a bit less than 6 GB free, 3-4 GB used, and 6-7 GB buffer/cache. On amd64, I can build on 4 GB, and even on 2 GB with some restarts. Is there an explanation of why we need so much more RAM to compile on ARM? (The 16 GB build machine costs 8 cents/hour more than the 8 GB one -- not really a big deal if the build can complete in an hour or two -- but I'm still surprised by the memory needs.)

znmeb commented 3 years ago

@jgm I've never tried it on 8 GB - just 4, which fails, and 16, which works. My recollection is that the crashes are in the back end, and that I tried both the gcc and LLVM back ends. I don't remember if I ever got the native GHC back end to work or not, or which version of GHC I used - everything works in 16 GB with the GHC 8.0 that ships in Ubuntu 18.04 so I stopped experimenting.

benz0li commented 3 years ago

I can't explain the enormous memory needs while building pandoc on ARM. I tried on a t4g.xlarge (16 GB RAM) myself... and failed. I've just started a new build with a 16 GB swapfile in addition.

I used some heavy machinery, a t4g.2xlarge (8 vCPUs, 32 GB RAM + 16 GB swapfile), to build pandoc-2.11.4-1-arm64.deb, because GHC Docker images were being built in parallel.
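For anyone reproducing this, adding a 16 GB swapfile is the standard Linux procedure (not necessarily the exact commands used here):

$ sudo fallocate -l 16G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile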

jgm commented 3 years ago

Same experience here on t4g.xlarge. Trying now with t4g.2xlarge! It may mean that creating the release package on ARM costs $1-2; we can afford that. But the difference between ARM and non-ARM is strange. (Compilation also takes much longer.)

znmeb commented 3 years ago

Interesting - my AGX Xavier has 16 GB of RAM, Ubuntu 18.04 LTS for arm64, and the NVIDIA Docker runtime. The stock apt binaries are GHC 8.0 "unregisterised" - it doesn't have the native back end - and cabal-install 1.24. LLVM is 3.7.

All I have to do to build Pandoc is upgrade cabal-install to 3.0 and do a cabal install pandoc. It takes overnight to do all that, even with the eight cores, but it does work. IIRC it works with either the GCC back end or the LLVM back end. And I can bootstrap up to GHC 8.2, or upgrade cabal-install and bootstrap up to GHC 8.4, etc. I haven't been able to get stack to work, but I don't need it.
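The bootstrap flow described above looks roughly like this (a reconstruction; exact versions depend on what apt ships):

$ # refresh the package index known to the stock cabal-install 1.24
$ cabal update
$ # use the old cabal to build a newer cabal-install
$ cabal install cabal-install
$ # then build pandoc with the new cabal (installed under ~/.cabal/bin)
$ ~/.cabal/bin/cabal update
$ ~/.cabal/bin/cabal install pandoc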

My recollection from watching these things go by with top is that the memory hogs are the back ends, especially the gcc one, not GHC itself. If you've got connections in the GHC team and want me to capture any trace / debug info on this on the AGX Xavier, have them open an issue on https://github.com/edgyR/edgyR-containers/issues.

One other possibility for Ubuntu: there is a PPA on Launchpad with Haskell backports including Pandoc: https://launchpad.net/~savoury1/+archive/ubuntu/haskell-build/. It does not appear to be set up to build for arm64 but that may be changeable. That would use Ubuntu's build farm, I think.

jgm commented 3 years ago

I've been able to build with the 2xl instance. I'm adding a script to automate this process, and I'll try to include an arm64 binary in the next release. (We can then get statistics to see how much use it gets.)

benz0li commented 3 years ago

I have been able to build it with a t4g.xlarge AWS EC2 instance:

Build time: 5 hours.

benz0li commented 3 years ago

The issue with the GitLab CI/CD Pipeline is resolved, too. I am going to rebuild all images and multi-arch manifests for registry.gitlab.b-data.ch/ghc/ghc4pandoc tomorrow.

I will continue to maintain these images and keep them publicly available. @jgm Let me know if you need a version in-between 8.10.1 and 8.10.4.

znmeb commented 3 years ago

> The issue with the GitLab CI/CD Pipeline is resolved, too. I am going to rebuild all images and multi-arch manifests for registry.gitlab.b-data.ch/ghc/ghc4pandoc tomorrow.
>
> I will continue to maintain these images and keep them publicly available. @jgm Let me know if you need a version in-between 8.10.1 and 8.10.4.

Hmm ... maybe I should move my Ubuntu build chain to GitLab and do my R and RStudio builds there on top of your Pandoc build. That would check off a huge box for me; I could go with R from the Ubuntu source repos instead of from upstream. I only need the Jetson base / NVIDIA tools for the last step in my processing.

benz0li commented 3 years ago

@jgm pandoc-2.12-linux-arm64.tar.gz seems to be broken:

$ tar -xzf pandoc-2.12-linux-arm64.tar.gz

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
jgm commented 3 years ago

Oh, weird! I've manually unpacked the deb and created a tarball from it, replacing the one on releases. @benz0li can you try again?
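For reference, the manual repackaging amounts to something like this (filenames illustrative; dpkg-deb -x extracts a .deb without installing it):

$ dpkg-deb -x pandoc-2.12-1-arm64.deb tmp
$ mkdir -p pandoc-2.12-arm64/bin pandoc-2.12-arm64/share/man/man1
$ cp tmp/usr/bin/pandoc pandoc-2.12-arm64/bin/
$ cp tmp/usr/share/man/man1/pandoc.1.gz pandoc-2.12-arm64/share/man/man1/
$ tar -czf pandoc-2.12-linux-arm64.tar.gz pandoc-2.12-arm64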

I think I can guess what happened. My build script waits until the .tar.gz appears in the artifacts and then downloads it. Perhaps it downloaded the file while it was still incomplete? In any case, the build script needs some improvement.
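A cheap guard for the script would be to verify the archive is complete before using it, e.g. (illustrative, not the actual build script):

$ tar -tzf pandoc-2.12-linux-arm64.tar.gz > /dev/null && echo complete || echo truncated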

benz0li commented 3 years ago

Unpacking works fine now.

But I'm not sure if this is the intended structure and content for the arm64 release:

$ tree pandoc-2.12-arm64 
pandoc-2.12-arm64
├── bin
│   └── pandoc
└── tmp
    └── usr
        ├── bin
        │   └── pandoc
        └── share
            ├── doc
            │   └── pandoc
            │       └── copyright
            └── man
                └── man1
                    └── pandoc.1.gz

9 directories, 4 files

Compared to the amd64 release:

$ tree pandoc-2.12-amd64 
pandoc-2.12-amd64
├── bin
│   └── pandoc
└── share
    └── man
        └── man1
            └── pandoc.1.gz

4 directories, 2 files
jgm commented 3 years ago

No, that's not right. Let me fix it manually; I don't want to wait another 3 hours for a build to complete!

jgm commented 3 years ago

@benz0li Good news. I tried tweaking a few settings, and now the build takes one hour 15 minutes. What I did was increase the heap size for the compiler and (more significantly) tell it to use 4 cores instead of 1. (With this build machine we could probably go even higher.)
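In cabal terms, those tweaks amount to something like this (a sketch, not necessarily the exact flags in the release script; -A256m enlarges GHC's allocation area, and -j4 builds up to 4 packages in parallel):

$ cabal build -j4 --ghc-options='+RTS -A256m -RTS' pandoc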

znmeb commented 3 years ago

My experiments didn't gain much from multiple cores. A bunch of small-ish libraries get built at the beginning, but once they're done, it's single-threaded for the large ones.

benz0li commented 3 years ago

@jgm More good news.

A linux/arm64 build on a Mac mini (Apple M1 chip, 16 GB memory) using the tech preview of Docker Desktop for Apple M1 takes just under an hour:

[screenshot: make_debpkg build output]

Docker settings used:

[screenshot: Docker Desktop preferences]

ℹ️ Memory increased from 4 GB to 16 GB; Swap increased from 1 GB to 4 GB; CPUs left at 4 👉 the Mac mini M1 has 4 performance cores.

Bearing in mind that Docker Desktop for Apple M1 runs in a virtual machine, which uses the new Virtualization Framework, the performance is outstanding.

jgm commented 3 years ago

Nice, so this produces a native M1 build? (See #6960)

benz0li commented 3 years ago

No. Docker Desktop for Apple M1 is effectively linux/arm64:

> Virtualization
>
> Create virtual machines and run Linux-based operating systems.
>
> Overview
>
> The Virtualization framework provides high-level APIs for creating and managing virtual machines on Apple silicon and Intel-based Mac computers. Use this framework to boot and run a Linux-based operating system in a custom environment that you define. The framework supports the Virtio specification, which defines standard interfaces for many device types, including network, socket, serial port, storage, entropy, and memory-balloon devices.

Source: Apple Developer > Documentation > Virtualization

znmeb commented 3 years ago

@benz0li Is the Apple operating system shim as lightweight as the Windows version of Docker Desktop, which uses WSL 2?