pandoc / dockerfiles

Dockerfiles for various pandoc images
GNU General Public License v2.0

add argument to control the number of concurrent jobs #156

Closed zkamvar closed 2 years ago

zkamvar commented 2 years ago

I was trying to build the core image from the Dockerfile to debug an error that came up during the nightly builds (the latest Docker image is 21 days old and does not produce the error). My laptop started freezing up because all threads were being used. After some digging, it looks like cabal is using the bare --jobs flag, which uses all available CPUs.

I have two proposals:

  1. set jobs: $ncpus - 1 in cabal.root.config (idk how feasible this is) https://github.com/pandoc/dockerfiles/blob/3b8ea8a61e0b479bdae3f08f050ca994334097e5/cabal.root.config#L15
  2. provide a make variable that defines the number of concurrent jobs.
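
A minimal sketch of the first proposal, assuming the job count is computed in the shell and handed to cabal's --jobs=N option, since it is not clear that cabal.root.config can evaluate an expression like $ncpus - 1. The helper name jobs_for is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: number of build jobs that leaves one CPU free.
# Takes the CPU count as an argument so it can be checked without nproc.
jobs_for() {
    n=$1
    if [ "$n" -gt 1 ]; then
        echo $(( n - 1 ))
    else
        # Never go below one job, even on a single-CPU machine.
        echo 1
    fi
}

# nproc is from GNU coreutils; fall back to 1 if it is unavailable.
ncpus=$(nproc 2>/dev/null || echo 1)

# Print the cabal invocation rather than running it.
echo "cabal build --jobs=$(jobs_for "$ncpus")"
```

On an 8-CPU machine this would print a command with --jobs=7.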

tarleb commented 2 years ago

Would it make sense to handle this by setting the number of CPUs used by Docker instead? https://docs.docker.com/config/containers/resource_constraints/

zkamvar commented 2 years ago

> Would it make sense to handle this by setting the number of CPUs used by Docker instead? https://docs.docker.com/config/containers/resource_constraints/

Yes. I think that would be a good solution.

zkamvar commented 2 years ago

Actually, no. This is a runtime option, not a build option. I attempted to add --cpus=5 to this docker command and got an "unknown flag" error:

https://github.com/pandoc/dockerfiles/blob/1249b304417a986b603f2b02cddc764eea280f57/Makefile#L117-L124

tarleb commented 2 years ago

docker build --help mentions --cpu-quota, --cpu-period, and --cpuset-cpus. I don't know what they do, but it looks like at least one of them should be helpful.

zkamvar commented 2 years ago

Going off the documentation, which says that --cpus=1.5 is equivalent to --cpu-quota="150000" --cpu-period="100000", I attempted to allocate 5 (out of 8) CPUs like so:

$ STACK=ubuntu THREADS=5 make -n core
docker build \
    --cpu-period="100000" \
    --cpu-quota="$(( 5 * 100000 ))" \
    --tag pandoc/ubuntu:edge \
    --build-arg pandoc_commit=master \
    --build-arg pandoc_version=edge \
    --build-arg without_crossref= \
    --build-arg extra_packages="pandoc-crossref" \
    --target ubuntu-core \
    -f pandoc/dockerfiles//ubuntu/Dockerfile pandoc/dockerfiles/
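
The quota arithmetic used above can be sketched on its own: the quota is the number of CPUs multiplied by the period. This assumes the 100000 microsecond period from the command; the function name cpu_quota is hypothetical. awk is used because plain sh arithmetic cannot handle fractional values like 1.5.

```shell
#!/bin/sh
# CPU period in microseconds, matching --cpu-period in the command above.
CPU_PERIOD=100000

# Hypothetical helper: quota (microseconds of CPU time per period)
# for a given number of CPUs, which may be fractional.
cpu_quota() {
    awk -v cpus="$1" -v period="$CPU_PERIOD" \
        'BEGIN { printf "%d\n", cpus * period }'
}

cpu_quota 1.5   # prints 150000
cpu_quota 5     # prints 500000
```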

And it did behave to a degree, but I had failed to take into account the memory usage, which I should have realized was the problem -_-

[Screenshot: htop showing several threads being used for ghc]

Nevertheless, would you like me to make a PR with a THREADS argument?

tarleb commented 2 years ago

Not sure if it would be better to have an easy-to-use THREADS argument or a more general DOCKER_BUILD_ARGS or similar. Either way, a PR would be very welcome! :relaxed:
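
For illustration, the two options are not mutually exclusive. The sketch below assumes both variables are read from the environment and merged into one flag string; build_flags is a hypothetical name, and this is not how the repository's Makefile is actually written.

```shell
#!/bin/sh
# Hypothetical sketch: combine a THREADS shortcut with a general
# DOCKER_BUILD_ARGS pass-through into one string of docker build flags.
build_flags() {
    flags=""
    if [ -n "${THREADS:-}" ]; then
        # Whole-number CPU counts only; the period is fixed at 100000.
        flags="--cpu-period=100000 --cpu-quota=$(( THREADS * 100000 ))"
    fi
    if [ -n "${DOCKER_BUILD_ARGS:-}" ]; then
        # Append the extra flags, adding a space only if needed.
        flags="${flags:+$flags }$DOCKER_BUILD_ARGS"
    fi
    echo "$flags"
}

# Run in a subshell so the example values do not leak into the caller.
( THREADS=5; DOCKER_BUILD_ARGS="--no-cache"; build_flags )
# prints: --cpu-period=100000 --cpu-quota=500000 --no-cache
```

With neither variable set, the function prints an empty line, so the docker build command would run with its defaults.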