Decommission v1 data pipeline

stephen-soltesz commented 2 years ago

- [x] Send note to discuss@ regarding deprecation - due after May 9th.
- [x] Eliminate all references to v1 tables from unified views
- [x] Delete v1 pipeline gardener deployments
  - [x] ndt5, tcpinfo, paris1, ndt.web100, switch, sidestream from data-processing-cluster
- [x] Delete v1 pipeline from App Engine
  - [x] etl-batch-parser
  - [x] queue-pusher
  - [x] annotator (confirm no more requests coming in)
- [x] Migrate downloader service to data-processing cluster with v2 pipeline.
  - [x] sandbox
  - [x] staging
  - [x] production
- [x] Delete data-processing-cluster

stephen-soltesz commented 2 years ago

https://github.com/m-lab/prometheus-support/releases/tag/v2.56.0
https://github.com/m-lab/etl-schema/releases/tag/v3.41.0

stephen-soltesz commented 2 years ago

From the mlab-staging unified downloads daily count: verifying that the web100_static update is working as intended (WAI).

stephen-soltesz commented 2 years ago

From measurement-lab unified uploads after deployment to production/public views.

stephen-soltesz commented 2 years ago

Evidently, the v2 pipeline parsers in production are using the default value of annotatorURL (https://github.com/m-lab/etl/blob/master/cmd/etl_worker/etl_worker.go#L71), which targets mlab-sandbox.

These requests should be no-ops (just burned cycles) and are corrected by https://github.com/m-lab/etl/pull/1078. But this was an unexpected, unintended cross-project configuration, and it must be resolved before deleting the annotation-service in sandbox.
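One way to guard against this kind of silent cross-project default is to validate the configured URL at startup. The following is only a sketch of that idea, not the change made in the PR above: `checkAnnotatorURL` is a hypothetical helper, the example hosts are invented, and the rule "the host must contain the project name" is an assumption chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// checkAnnotatorURL is a hypothetical helper (not part of m-lab/etl).
// It rejects an unset URL and any URL whose host does not contain the
// name of the project the worker is deployed in.
func checkAnnotatorURL(rawURL, project string) error {
	if rawURL == "" {
		return errors.New("annotator URL is not set")
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("invalid annotator URL %q: %w", rawURL, err)
	}
	if u.Host == "" {
		return fmt.Errorf("annotator URL %q has no host", rawURL)
	}
	// Assumption: the project name appears somewhere in the host name.
	if !strings.Contains(u.Host, project) {
		return fmt.Errorf("annotator host %q is outside project %q", u.Host, project)
	}
	return nil
}

func main() {
	// Invented example hosts; only the project names come from this thread.
	urls := []string{
		"https://annotator.mlab-sandbox.example.com/batch_annotation",
		"https://annotator.mlab-oti.example.com/batch_annotation",
	}
	for _, u := range urls {
		fmt.Printf("%s: %v\n", u, checkAnnotatorURL(u, "mlab-oti"))
	}
}
```

Failing at startup turns a misconfiguration into a visible deploy error instead of background traffic to another project.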

To complete:

- complete fix
- deploy fix to production
- delete annotation-service in sandbox (and other projects)

stephen-soltesz commented 2 years ago

I have confirmed that the only IPs accessing the annotation-service according to the request logs are from GKE nodes in mlab-oti. No other requests are directed to the /batch_annotation resource.

stephen-soltesz commented 2 years ago

After deploying https://github.com/m-lab/etl/pull/1078, the production v2 pipeline is no longer targeting the mlab-sandbox annotation-service.

There are no requests to the annotation-service in mlab-staging or mlab-oti either, so it should be safe to delete this service now.

stephen-soltesz commented 2 years ago

After deploying the above PRs to staging, I've confirmed that downloader_last_success_time_seconds is still reported in the staging Prometheus after deleting the old downloader from data-processing-cluster. I also confirmed updates in gs://downloader-mlab-staging.

stephen-soltesz commented 2 years ago

Since deleting the v1 data pipeline components, the burn rate has dropped by ~$6k/day.

stephen-soltesz commented 2 years ago

\o/ - Fin.

m-lab / etl

Decommission v1 data pipeline #1074