m-lab / etl-gardener

Gardener provides services for maintaining and reprocessing mlab data.
Apache License 2.0

"Jobs" property in Datastore exceeds 1048487 bytes #364

Closed: robertodauria closed this issue 2 years ago

robertodauria commented 2 years ago

After the recent addition of a new v2 parser for the switch datatype, I've started seeing these errors in etl-gardener-universal's logs, at 1-minute intervals:

2022-01-20 17:15:51.594 GMT
2022/01/20 17:15:51 tracker.go:134: rpc error: code = InvalidArgument desc = The value of property "Jobs" is longer than 1048487 bytes. 

The Jobs property of the tracker entity appears to contain the status of all Gardener jobs, serialized as a single value. With the addition of a new datatype, that value went over the maximum size for a Datastore property.

[{"Job":{"Bucket":"archive-measurement-lab","Experiment":"utilization","Datatype":"switch","Date":"2019-06-20T00:00:00Z"},"State":{"HeartbeatTime":"2022-01-21T13:22:04.688625137Z","UpdateCount":374,"History":[{"State":"init","Start":"2022-01-21T13:21:17.946479008Z","DetailTime":"2022-01-21T13:21:18.007568791Z","Detail":"starting tasks"},{"State":"parsing","Start":"2022-01-21T13:21:18.007569154Z","DetailTime":"2022-01-21T13:22:14.973814476Z","Detail":""},{"State":"postProcessing","Start":"2022-01-21T13:22:14.973814957Z","DetailTime":"2022-01-21T13:22:14.973814957Z","Detail":""},{"State":"loading","Start":"2022-01-21T13:22:16.533091008Z","DetailTime":"2022-01-21T13:22:43.455992192Z","Detail":"Load took 18.6s (after 0s waiting), 3159270 rows with 12127050404 bytes, from 367 files with 22097940151 bytes"},{"State":"deduplicating","Start":"2022-01-21T13:22:43.455992598Z","DetailTime":"2022-01-21T13:22:52.095676294Z","Detail":"Dedup took 5.2s (after 0s waiting),  7.99 Slot Minutes, 0 Rows affected, 12127 MB Processed, 12127 MB Billed"},

[...]
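The failure mode described above can be checked before a write is attempted. The following is a minimal sketch, not Gardener's actual code: `Job`, `jobStatus`, and `fitsInProperty` are hypothetical, simplified stand-ins, and the only value taken from the issue is the 1048487-byte limit quoted in the error message. It assumes the property holds the JSON encoding of the job list, as the sample above suggests.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// maxPropertyBytes is the limit quoted in the Datastore error message.
const maxPropertyBytes = 1048487

// Job and jobStatus are hypothetical, simplified stand-ins for the
// structures visible in the JSON sample; they are not Gardener's types.
type Job struct {
	Bucket     string
	Experiment string
	Datatype   string
	Date       time.Time
}

type jobStatus struct {
	Job    Job
	Detail string
}

// fitsInProperty marshals v to JSON and reports the encoded size and
// whether it is within the limit. The error message says "longer than",
// so a value of exactly maxPropertyBytes is treated as acceptable.
func fitsInProperty(v interface{}) (int, bool, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return 0, false, err
	}
	return len(b), len(b) <= maxPropertyBytes, nil
}

func main() {
	jobs := []jobStatus{{
		Job: Job{
			Bucket:     "archive-measurement-lab",
			Experiment: "utilization",
			Datatype:   "switch",
			Date:       time.Date(2019, 6, 20, 0, 0, 0, 0, time.UTC),
		},
		Detail: "starting tasks",
	}}
	size, ok, err := fitsInProperty(jobs)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Printf("encoded size: %d bytes, fits: %v\n", size, ok)
}
```

A check like this would only surface the problem earlier and with a clearer message; it does not reduce the size of the stored value.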
stephen-soltesz commented 2 years ago

Related to

stephen-soltesz commented 2 years ago

Ah, it seems that part of the problem comes from our moving the gardener start time, so there are more jobs between "start time" and "now" AND more datatypes as well.