Closed kohlisid closed 3 months ago
Pipeline spec used
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: simple-pipeline
spec:
  vertices:
    - name: in
      scale:
        min: 10
        max: 10
      source:
        # A self data generating source
        generator:
          msgSize: 500
          rpu: 5000
          duration: 1s
          value: 100
      containerTemplate:
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
    - name: batch-cat
      metadata:
        annotations:
          numaflow.numaproj.io/batch-map: "true"
      partitions: 12
      scale:
        min: 18
        max: 18
      udf:
        container:
          image: quay.io/kohlisid/numaflow-go/batch-map-flatmap:test1
          resources:
            limits:
              cpu: "4"
              memory: 16Gi
            requests:
              cpu: "2"
              memory: 8Gi
      containerTemplate:
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
    - name: out
      partitions: 12
      scale:
        min: 6
        max: 6
      sink:
        # A simple log printing sink
        blackhole: {}
      containerTemplate:
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: "2"
            memory: 4Gi
  edges:
    - from: in
      to: batch-cat
    - from: batch-cat
      to: out
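For reference, the load implied by this spec can be worked out from the generator settings. The sketch below is only a back-of-the-envelope calculation, not anything Numaflow runs: it assumes `rpu` applies per source pod, that `msgSize` is the payload size in bytes, that messages spread evenly across replicas, and that the UDF emits one output per input (a flatmap UDF may emit more). The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratorSource:
    replicas: int      # scale.min == scale.max == 10 for vertex "in"
    rpu: int           # messages per `duration`, assumed to be per source pod
    duration_s: float  # generator.duration ("1s")
    msg_size: int      # generator.msgSize, assumed to be bytes per message

    def messages_per_second(self) -> float:
        return self.replicas * self.rpu / self.duration_s

    def bytes_per_second(self) -> float:
        return self.messages_per_second() * self.msg_size


def per_pod_rate(total_rate: float, replicas: int) -> float:
    """Average rate per pod, assuming an even spread across replicas."""
    return total_rate / replicas


source = GeneratorSource(replicas=10, rpu=5000, duration_s=1.0, msg_size=500)
rate = source.messages_per_second()

print(f"aggregate:     {rate:,.0f} msg/s")
print(f"payload:       {source.bytes_per_second() / 1e6:.1f} MB/s")
print(f"per batch-cat: {per_pod_rate(rate, 18):,.0f} msg/s")
print(f"per out pod:   {per_pod_rate(rate, 6):,.0f} msg/s (assumes one output per input)")
print(f"5-day total:   {rate * 86_400 * 5:,.0f} messages")
```

Under these assumptions the aggregate comes out at 50,000 msg/s, which matches the 50k load quoted for the test.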
kubectl top pods
simple-pipeline-batch-cat-0-0mcfa 1191m 73Mi
simple-pipeline-batch-cat-1-swdiw 1147m 69Mi
simple-pipeline-batch-cat-10-4yt10 1154m 73Mi
simple-pipeline-batch-cat-11-omegc 1039m 69Mi
simple-pipeline-batch-cat-12-riuav 1201m 74Mi
simple-pipeline-batch-cat-13-ebhjb 1131m 69Mi
simple-pipeline-batch-cat-14-x1jm8 1208m 67Mi
simple-pipeline-batch-cat-15-0onio 1152m 68Mi
simple-pipeline-batch-cat-16-vyegw 1066m 76Mi
simple-pipeline-batch-cat-17-wpmkn 1062m 71Mi
simple-pipeline-batch-cat-2-o8yxk 1064m 71Mi
simple-pipeline-batch-cat-3-032nh 1041m 70Mi
simple-pipeline-batch-cat-4-juiek 1095m 72Mi
simple-pipeline-batch-cat-5-8ckf1 1082m 66Mi
simple-pipeline-batch-cat-6-mbqdv 1101m 69Mi
simple-pipeline-batch-cat-7-f6xtg 1131m 69Mi
simple-pipeline-batch-cat-8-jgjli 1150m 68Mi
simple-pipeline-batch-cat-9-yxyf9 981m 66Mi
simple-pipeline-daemon-854ff49886-f2pfg 181m 58Mi
simple-pipeline-in-0-fdyh6 550m 32Mi
simple-pipeline-in-1-xgzj3 527m 34Mi
simple-pipeline-in-2-fezzd 508m 33Mi
simple-pipeline-in-3-pnavp 471m 34Mi
simple-pipeline-in-4-jiwm8 503m 32Mi
simple-pipeline-in-5-trjds 473m 32Mi
simple-pipeline-in-6-rmhql 454m 34Mi
simple-pipeline-in-7-hf0ll 536m 34Mi
simple-pipeline-in-8-zxc3s 442m 33Mi
simple-pipeline-in-9-ea3dl 485m 35Mi
simple-pipeline-out-0-npy4r 1524m 66Mi
simple-pipeline-out-1-uwkuh 1601m 58Mi
simple-pipeline-out-2-reeaw 1446m 63Mi
simple-pipeline-out-3-y16bh 1539m 66Mi
simple-pipeline-out-4-hincx 1541m 65Mi
simple-pipeline-out-5-sjzpr 1466m 63Mi
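To compare vertices at a glance, the output above can be grouped by vertex. This is a rough sketch that assumes pod names follow the `<pipeline>-<vertex>-<replica>-<suffix>` pattern seen in the listing and that CPU is reported in millicores; `cpu_by_vertex` and `SAMPLE` are illustrative names, and the sample holds only a few of the lines above.

```python
import re
from collections import defaultdict
from statistics import mean

# A few lines copied from the `kubectl top pods` output above.
SAMPLE = """\
simple-pipeline-batch-cat-0-0mcfa 1191m 73Mi
simple-pipeline-batch-cat-1-swdiw 1147m 69Mi
simple-pipeline-daemon-854ff49886-f2pfg 181m 58Mi
simple-pipeline-in-0-fdyh6 550m 32Mi
"""


def cpu_by_vertex(top_output: str, pipeline: str = "simple-pipeline") -> dict[str, list[int]]:
    """Group CPU usage (millicores) by vertex name; non-vertex pods are skipped."""
    pattern = re.compile(rf"{re.escape(pipeline)}-(.+)-(\d+)-[a-z0-9]+")
    usage: dict[str, list[int]] = defaultdict(list)
    for line in top_output.strip().splitlines():
        parts = line.split()
        if len(parts) != 3 or not parts[1].endswith("m"):
            continue  # header line or unexpected format
        name, cpu, _mem = parts
        match = pattern.fullmatch(name)
        if match is None:
            continue  # e.g. the daemon pod, which has no replica index
        usage[match.group(1)].append(int(cpu[:-1]))
    return dict(usage)


for vertex, cores in cpu_by_vertex(SAMPLE).items():
    print(f"{vertex}: n={len(cores)} min={min(cores)}m mean={mean(cores):.0f}m max={max(cores)}m")
```

Reading the full listing directly, batch-cat pods sit between 981m and 1208m, source pods between 442m and 550m, and sink pods between 1446m and 1601m, all well under the 4-core limits in the spec.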
UDF processing time
@vigith @whynowy What other details would you like to attach here? cc @numaproj/numaflow-dev
Why was there a spike on Monday?
@KeranYang Pod migrations on the cluster:
"reason":"EvictionByEvictionAPI","message":"Eviction API: evicting"
No restarts or errors were seen in the pods:
simple-pipeline-batch-cat-0-0mcfa 2/2 Running 0 32h
simple-pipeline-batch-cat-1-swdiw 2/2 Running 0 28h
simple-pipeline-batch-cat-10-4yt10 2/2 Running 0 30h
simple-pipeline-batch-cat-11-omegc 2/2 Running 0 31h
simple-pipeline-batch-cat-12-riuav 2/2 Running 0 32h
simple-pipeline-batch-cat-13-ebhjb 2/2 Running 0 28h
simple-pipeline-batch-cat-14-x1jm8 2/2 Running 0 31h
simple-pipeline-batch-cat-15-0onio 2/2 Running 0 33h
simple-pipeline-batch-cat-16-vyegw 2/2 Running 0 28h
simple-pipeline-batch-cat-17-wpmkn 2/2 Running 0 28h
simple-pipeline-batch-cat-2-o8yxk 2/2 Running 0 30h
simple-pipeline-batch-cat-3-032nh 2/2 Running 0 29h
simple-pipeline-batch-cat-4-juiek 2/2 Running 0 32h
simple-pipeline-batch-cat-5-8ckf1 2/2 Running 0 28h
simple-pipeline-batch-cat-6-mbqdv 2/2 Running 0 29h
simple-pipeline-batch-cat-7-f6xtg 2/2 Running 0 28h
simple-pipeline-batch-cat-8-jgjli 2/2 Running 0 31h
simple-pipeline-batch-cat-9-yxyf9 2/2 Running 0 29h
simple-pipeline-daemon-854ff49886-f2pfg 1/1 Running 0 32h
simple-pipeline-in-0-fdyh6 1/1 Running 0 32h
simple-pipeline-in-1-xgzj3 1/1 Running 0 30h
simple-pipeline-in-2-fezzd 1/1 Running 0 28h
simple-pipeline-in-3-pnavp 1/1 Running 0 33h
simple-pipeline-in-4-jiwm8 1/1 Running 0 29h
simple-pipeline-in-5-trjds 1/1 Running 0 28h
simple-pipeline-in-6-rmhql 1/1 Running 0 29h
simple-pipeline-in-7-hf0ll 1/1 Running 0 27h
simple-pipeline-in-8-zxc3s 1/1 Running 0 28h
simple-pipeline-in-9-ea3dl 1/1 Running 0 30h
simple-pipeline-out-0-npy4r 1/1 Running 0 32h
simple-pipeline-out-1-uwkuh 1/1 Running 0 30h
simple-pipeline-out-2-reeaw 1/1 Running 0 28h
simple-pipeline-out-3-y16bh 1/1 Running 0 33h
simple-pipeline-out-4-hincx 1/1 Running 0 31h
simple-pipeline-out-5-sjzpr 1/1 Running 0 28h
Good, no containers were ever restarted :)
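For a long-running test, the same check can be scripted rather than read by eye. The sketch below assumes the default `kubectl get pods` column order (NAME, READY, STATUS, RESTARTS, AGE); `unhealthy_pods` is a hypothetical helper, not part of kubectl or Numaflow.

```python
def unhealthy_pods(get_pods_output: str) -> list[str]:
    """Return pods that are not fully ready, not Running, or have restarted."""
    flagged = []
    for line in get_pods_output.strip().splitlines():
        parts = line.split()
        if len(parts) < 5 or "/" not in parts[1]:
            continue  # header row or unrelated line
        name, ready, status, restarts = parts[:4]
        ready_now, ready_want = ready.split("/")
        if ready_now != ready_want or status != "Running" or restarts != "0":
            flagged.append(name)
    return flagged


# Two lines copied from the listing above.
SAMPLE = """\
simple-pipeline-batch-cat-0-0mcfa 2/2 Running 0 32h
simple-pipeline-in-0-fdyh6 1/1 Running 0 32h
"""

print(unhealthy_pods(SAMPLE) or "all pods healthy")
```

An empty list means every pod is fully ready, in the Running state, and has a restart count of 0.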
@kohlisid, please paste the ISB spec for reference, too.
ISB Spec
apiVersion: numaflow.numaproj.io/v1alpha1
kind: InterStepBufferService
metadata:
  name: default
spec:
  jetstream:
    version: 2.10.11
    startArgs:
    replicas: 3
    persistence:
      storageClassName: gp3
      accessMode: ReadWriteOnce
      volumeSize: 40Gi
    containerTemplate:
      resources:
        limits:
          memory: 16384Mi
        requests:
          cpu: 8
          memory: 16384Mi
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: isbsvc
                  numaflow.numaproj.io/isbsvc-name: fci-session
              topologyKey: topology.kubernetes.io/zone
            weight: 100
All green on the endurance test!
Running an endurance test on a pipeline with a constant load of 50k messages per second (10 source pods at rpu 5000 per 1s) for 5 days. Attached is the read rate for the vertex.