dioguerra closed this issue 4 years ago
Please provide more details on how to reproduce the problem (or what the problem is exactly). I've just finished a testing build where I disconnected a node while it was running builds, and the build finished successfully by retrying those builds locally (which is good enough, given that this is an exceptional situation).
It might not matter in your environment, but in a cluster with auto-scaling of worker nodes (where nodes are created and deleted on demand) this has a real impact.
HOW TO REPRODUCE: First, build your image:
FROM fedora:31
# Build environment
RUN dnf install -y \
        icecream \
        clang \
        doxygen \
        gcc \
        graphviz \
        libasan \
        libasan-static \
        libedit-devel \
        libxml2-devel \
        make \
        net-tools \
        python-devel \
        swig \
        git bc xz
RUN dnf group install "Development Tools" -y && \
dnf group install "C Development Tools and Libraries" -y && \
dnf install cmake ninja-build ncurses-devel bison flex elfutils-libelf-devel openssl-devel -y
# Run icecc daemon in verbose mode
#ENTRYPOINT ["iceccd","-v"]
#ENTRYPOINT ["icecc-scheduler","-v"]
# iceccd port
EXPOSE 10245 8765/TCP 8765/UDP 8766
# If no-args passed, make very verbose
#CMD ["-vvv"]
Then, create the resources on your cluster. If you don't have access to a Kubernetes cluster, you can use your local computer with minikube, though the per-pod resource requests and limits should then be a fraction of your available CPU.
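For a local test, a minikube sketch along these lines should be enough (the CPU/memory values here are only an assumption, size them to your machine):
# Start a small local cluster for testing
minikube start --cpus=4 --memory=8g
The manifests to apply are the following: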
apiVersion: v1
kind: Namespace
metadata:
  name: division
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: icecc-division-scheduler
  namespace: division
  labels:
    app: icecc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: icecc-scheduler-division-user
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
      labels:
        app: icecc-scheduler-division-user
    spec:
      hostname: scheduler-division-user
      containers:
      - name: icecc-scheduler-division-user
        image: icecc:fedora31
        command:
        - /bin/bash
        - -c
        - "iceccd -vvv -m 8 -d && icecc-scheduler -vvv"
        # args:
        # -
        # - -n ICECREAM
        # - -l /dev/stdout
        env:
        - name: ICECREAM_SCHEDULER_LOG_FILE
          value: "/dev/stdout"
        - name: ICECREAM_MAX_JOBS
          value: "3"
        - name: ICECREAM_NETNAME
          value: "division-user"
        resources:
          limits:
            cpu: 8
            memory: 8Gi
        ports:
        # Daemon computers
        - containerPort: 10245
        # Scheduler computer
        - containerPort: 8765
        # broadcast to find the scheduler (optional)
        - containerPort: 8765
          protocol: UDP
        # telnet interface to the scheduler (optional)
        - containerPort: 8766
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: icecc-scheduler-division-user
  name: icecc-division-scheduler
  namespace: division
spec:
  ports:
  - port: 10245
    name: daemon
    protocol: TCP
    targetPort: 10245
  - port: 8765
    name: scheduler
    protocol: TCP
    targetPort: 8765
  - port: 8765
    name: broadcast
    protocol: UDP
    targetPort: 8765
  - port: 8766
    name: telnet
    protocol: TCP
    targetPort: 8766
  selector:
    app: icecc-scheduler-division-user
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: icecc-division-worker
  namespace: division
  labels:
    app: icecc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: icecc-worker-division-user
  template:
    metadata:
      labels:
        app: icecc-worker-division-user
    spec:
      containers:
      - name: icecc-worker-division-user
        image: icecc:fedora31
        command:
        - /bin/bash
        args:
        - -c
        - "iceccd -vvv -m 1 -s $(host -4 icecc-division-scheduler | awk '{print $4}')"
        # - "iceccd -vvv -m 1 -s 192.168.1.68"
        env:
        - name: ICECREAM_LOG_FILE
          value: "/dev/stdout"
        - name: ICECREAM_MAX_JOBS
          value: "3"
        - name: ICECREAM_NETNAME
          value: "division-user"
        # - name: ICECREAM_SCHEDULER_HOST
        #   value: icecc-division-scheduler.division.svc.cluster.local
        resources:
          requests:
            cpu: 1
          limits:
            cpu: 1
        ports:
        # Daemon computers
        - containerPort: 10245
        # Scheduler computer
        - containerPort: 8765
        # broadcast to find the scheduler (optional)
        - containerPort: 8765
          protocol: UDP
        # telnet interface to the scheduler (optional)
        - containerPort: 8766
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: icecc-worker-division-user
  name: icecc-division-worker
  namespace: division
spec:
  ports:
  - port: 10245
    name: daemon
    protocol: TCP
    targetPort: 10245
  - port: 8765
    name: scheduler
    protocol: TCP
    targetPort: 8765
  - port: 8765
    name: broadcast
    protocol: UDP
    targetPort: 8765
  - port: 8766
    name: telnet
    protocol: TCP
    targetPort: 8766
  selector:
    app: icecc-worker-division-user
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: icecc-division-worker
  namespace: division
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: icecc-division-worker
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 90
  minReplicas: 1
  maxReplicas: 15
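To deploy, something along these lines should work, assuming the manifests above are saved as icecc-division.yaml (a filename chosen here just for illustration):
# Create the namespace, deployments, services and the HPA
kubectl apply -f icecc-division.yaml
# Verify that the scheduler and worker pods come up
kubectl -n division get pods -o wide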
The way to trigger a build job is to exec into your scheduler pod:
kubectl -n division exec -it pod/icecc-division-scheduler-<pod-suffix> -- bash
and execute your compiler. You can test with the instructions below:
git clone https://github.com/llvm/llvm-project.git ~/dev/llvm-project
mkdir -p ~/dev/llvm-builds/release-gcc-distcc
cd ~/dev/llvm-builds/release-gcc-distcc
export CC="/usr/bin/gcc"
export CXX="/usr/bin/g++"
cmake ~/dev/llvm-project/llvm \
-G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_USE_LINKER=gold \
-DLLVM_ENABLE_PROJECTS="lldb;clang;lld" \
-DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
-DCMAKE_C_COMPILER_LAUNCHER=icecc \
-DCMAKE_CXX_COMPILER_LAUNCHER=icecc
time ninja -j 100
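To check whether jobs are actually being distributed rather than all falling back to local compilation, following the daemon output is usually enough; a sketch, assuming the Deployment names from the manifests above:
# Follow the scheduler pod's output (it also runs a local iceccd in this setup)
kubectl -n division logs deploy/icecc-division-scheduler -f
# Follow one of the worker daemons
kubectl -n division logs deploy/icecc-division-worker -f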
I would say that the scheduler should try to reschedule failed jobs N more times on other worker nodes AND maintain a tighter health check on the attached worker pods.
Handling the assembly job locally on the scheduler is OK if one of the pods fails, but if multiple pods are scaled down at the same time for lack of work (3 to 5 pods, which corresponds to 3 to 5 cores in this example), the compilation migrates every task to the scheduler and thus stops using the worker nodes that are still available (not scaled down).
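This is easy to observe during the long link steps: CPU utilization drops, the HPA scales the workers down, and the rest of the build then stays on the scheduler. A simple way to watch it happen with the resources defined above:
# Watch the HPA and the worker pods while the build is running
kubectl -n division get hpa,pods -w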
Sorry, but I find this setup to be so niche and the possible gain so small that the effort/gain ratio is way beyond reasonable for me. Feel free to improve the handling and submit patches.
Closing, as per my above comment.
@dioguerra did you get it solved? We're looking into running icecc on kubernetes with a cluster-autoscaler setup, and I was wondering if you managed to find a better solution?
IMO a good approach would be the following (I haven't checked which parts of this are actually implemented, so parts of it might not make sense).
This would help in two scenarios:
Second scenario:
Hey @marscher ,
To be honest I didn't pursue this further as I had no time; this was a side project and motivation ran low. I didn't think of using the SIGTERM signal though, that's a good idea.
If you get something to work, can you keep me posted?
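For what it's worth, the container-side half of the SIGTERM idea is mostly about making sure the signal actually reaches iceccd instead of stopping at the shell; a minimal entrypoint sketch, with the caveat that whether iceccd then deregisters cleanly from the scheduler is an assumption I have not verified:
#!/bin/bash
# Hypothetical worker entrypoint: forward SIGTERM from Kubernetes to iceccd
# so the daemon gets a chance to shut down before the pod is removed.
iceccd -vvv -m 1 -s "$ICECREAM_SCHEDULER_HOST" &
child=$!
trap 'kill -TERM "$child"; wait "$child"' TERM
wait "$child"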
icecc-scheduler does not handle iceccd daemons disconnecting suddenly very well. This causes the compiler to halt assembly distribution if a few of the iceccd clients disconnect suddenly. As I understand it, this happens because:
In my environment (Kubernetes) this is caused by the workers waiting for new assemble instructions while the scheduler is linking objects, so the worker nodes are killed by the pod autoscaler.
With https://github.com/icecc/icecream/issues/482 this would be mitigated, as workers could offload the scheduler.