autopilotpattern/jenkins

Extension of official Jenkins Docker image that supports Joyent's Triton for elastic slave provisioning

Is proclimit script too conservative? #4

Closed (tgross closed this issue 8 years ago)

tgross commented 8 years ago

I was doing an experiment on having applications automatically scale their thread counts based on available bursting capacity, and that led me to circle back to the proclimit script in this repo. The number of threads we're assuming here seems at odds with the number of vCPUs available to a given container. I took typical values for a Joyent compute node (CN) and ran them against this calculation (numbers in GB for clarity):


#!/bin/bash

TOTAL_MEM=256.0 # typical CN size in GB
CORES=48        # typical number of cores

calc_mem() {
      local zone_mem=$1
      local expected=$2
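      # dc works in RPN: 8k sets the scale to 8 decimal places, then it computes
      # zone_mem / TOTAL_MEM * CORES; p prints the result and q quits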
      local got=$(echo "8k $zone_mem $TOTAL_MEM / $CORES * pq" | dc)
      echo "For $zone_mem GB zone, expected $expected vCPU, got $got"
}

# Docker container sizes on public cloud
calc_mem 0.128 0.0625
calc_mem 0.256 0.125
calc_mem 0.512 0.25
calc_mem 1 0.5
calc_mem 2 1
calc_mem 8 4
calc_mem 16 8
calc_mem 32 16
calc_mem 64 32
./proclimit.sh
For 0.128 GB zone, expected 0.0625 vCPU, got .02400000
For 0.256 GB zone, expected 0.125 vCPU, got .04800000
For 0.512 GB zone, expected 0.25 vCPU, got .09600000
For 1 GB zone, expected 0.5 vCPU, got .18750000
For 2 GB zone, expected 1 vCPU, got .37500000
For 8 GB zone, expected 4 vCPU, got 1.50000000
For 16 GB zone, expected 8 vCPU, got 3.00000000
For 32 GB zone, expected 16 vCPU, got 6.00000000
For 64 GB zone, expected 32 vCPU, got 12.00000000

It looks to me like we're being overly conservative with this calculation, and it gets worse with larger instance sizes because we take a floor of 1: the floor papers over the gap for small zones but does nothing for large ones. Any thoughts on this, @dekobon or @misterbisson?
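For concreteness, here's a minimal sketch (my own illustration, assuming the current calculation is memory share times core count clamped to a minimum of 1; the real script may differ) of how that floor hides the gap on small zones but not on large ones:

#!/bin/bash

TOTAL_MEM=256.0 # typical CN size in GB
CORES=48        # typical number of cores

floored() {
  local zone_mem=$1
  # memory share of the CN times core count, clamped to a floor of 1
  echo "$zone_mem" | awk -v total="$TOTAL_MEM" -v cores="$CORES" \
    '{ v = $1 / total * cores; if (v < 1) v = 1; printf "%.4f\n", v }'
}

floored 1   # raw value is .1875, clamped up to 1.0000 -- looks roughly right
floored 64  # prints 12.0000 -- still far short of the ~32 vCPU we'd expect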

tgross commented 8 years ago

Did we ever come to a resolution on this? There's an open JIRA issue somewhere (I can't find the link), but I don't think we ever got a good answer.

dekobon commented 8 years ago

I'm proposing this script because it better represents the number of threads we want to tune to: it bases its calculation on the CPU cap rather than on the amount of memory available.

#!/usr/bin/env sh

##
# When this script is invoked inside of a zone:
#
# This script returns a number representing a very conservative estimate of the
# maximum number of processes or threads that you want to run within the zone
# that invoked this script. Typically, you would take this value and define a
# multiplier that works well for your application.
#
# Otherwise:
# This script returns the number of cores reported by the OS.

# If we are on an LX-brand zone, do the calculation using utilities that are only
# available under the /native directory.

if [ -d /native ]; then
  PATH=/native/sbin:/native/usr/bin:$PATH
fi

KSH="$(which ksh93)"
PRCTL="$(which prctl)"

if [ -n "${KSH}" ] && [ -n "${PRCTL}" ]; then
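  # zone.cpu-cap is expressed as a percentage of a single CPU (100 == one full
  # CPU), so dividing the privileged cap value by 100 gives the effective vCPU
  # count. ksh93 is used because its arithmetic handles floating point and ceil().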
  CAP=$(${KSH} -c "echo \$((\$(${PRCTL} -n zone.cpu-cap \$\$ | grep privileged | awk '{ print \$2 }') / 100))")

  # If there is no cap set, then we fall through and use the checks below to
  # determine the maximum number of processes.
  if [ -n "${CAP}" ]; then
    $KSH -c "echo \$((ceil(${CAP})))"
    exit 0
  fi
fi

# Linux calculation if you have nproc
if [ -n "$(which nproc)" ]; then
  nproc
  exit 0
fi

# More widely supported Linux implementation
if [ -f /proc/cpuinfo ] && [ -n "$(which wc)" ]; then
  grep processor /proc/cpuinfo | wc -l
  exit 0
fi

# OS X calculation
if [ "$(uname)" == "Darwin" ]; then
  sysctl -n hw.ncpu
  exit 0
fi

# Fallback value if we can't calculate
echo 1
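
A possible usage sketch (the names and the 2x multiplier here are hypothetical), applying a multiplier to the returned value as the header comment suggests:

#!/usr/bin/env sh
# Hypothetical example: size a worker pool (e.g. build executors) from the estimate.
LIMIT=$(./proclimit.sh)
# The multiplier is per-application; 2 is just an example. awk handles either
# integer or fractional output from proclimit.
WORKERS=$(awk -v n="$LIMIT" 'BEGIN { printf "%d\n", n * 2 }')
echo "Starting ${WORKERS} workers"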