m-lab / etl

M-Lab ingestion pipeline
Apache License 2.0
K8S parser failing file writes in staging #983

Closed gfr10598 closed 3 years ago

gfr10598 commented 3 years ago

Probably due to changing node-pools when CPUs changed from 7 to 15.

gfr10598 commented 3 years ago

Extracted create-parser-pool.sh and updated it to set the correct storage-rw scope. It is in the cluster-setup-script branch of etl-gardener: https://github.com/m-lab/etl-gardener/tree/cluster-setup-script

gfr10598 commented 3 years ago

Used script to update parser-pool. Set parser-pool1 size to zero. Pods migrated and working now.
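
For anyone hitting the same failure, here is a rough sketch of the two steps described above: create the pool with the `storage-rw` scope, then resize the old pool to zero. This is not the contents of create-parser-pool.sh. The cluster name, zone, and machine type are placeholder assumptions; only the pool names and the `storage-rw` scope come from this thread. Passing `--scopes` replaces the default scope set, so a real pool probably needs further scopes listed as well. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch only. CLUSTER, ZONE and MACHINE_TYPE are hypothetical placeholders,
# not values taken from the actual create-parser-pool.sh.
set -eu

CLUSTER="${CLUSTER:-example-cluster}"
ZONE="${ZONE:-us-central1-a}"
MACHINE_TYPE="${MACHINE_TYPE:-n1-standard-16}"
NEW_POOL="${NEW_POOL:-parser-pool}"    # pool that receives the corrected scope
OLD_POOL="${OLD_POOL:-parser-pool1}"   # pool that is scaled down to zero

# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: create the pool with read-write access to Cloud Storage.
# Assumption: other scopes the workloads need would be added to --scopes.
create_pool() {
  run gcloud container node-pools create "$NEW_POOL" \
    --cluster="$CLUSTER" --zone="$ZONE" \
    --machine-type="$MACHINE_TYPE" \
    --scopes=storage-rw
}

# Step 2: shrink the old pool to zero so its pods reschedule elsewhere.
drain_old_pool() {
  run gcloud container clusters resize "$CLUSTER" \
    --node-pool="$OLD_POOL" --num-nodes=0 --zone="$ZONE" --quiet
}

create_pool
drain_old_pool
```

Running it as-is prints the two `gcloud` invocations for review before anything is changed in the cluster.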

gfr10598 commented 3 years ago

TODO: This type of error should show up in the gardener pipeline overview dashboard.

laiyi-ohlsen commented 3 years ago

@gfr10598 Can you open a new issue for the remaining TODO and close this one (if it is resolved)?

gfr10598 commented 3 years ago

The Pipeline Overview prototype dashboard now includes these write failure metrics.

https://grafana.mlab-sandbox.measurementlab.net/d/tDpMHylGk/pipeline-overview?orgId=1&refresh=5m&var-project=mlab-sandbox&var-interval=10m&var-PrometheusDS=Prometheus%20(mlab-sandbox)&var-LegacyDS=Data%20Proc%20(mlab-sandbox)&var-Gardener2_DS=Data%20Processing%20(mlab-sandbox)&var-states=Done&var-states=Processing