We're currently having an issue where running a full backfill (over all data from 2018-2023) produces a job that succeeds but ends up dropping 2/3 of the expected rows. The dropped rows are not spread out evenly: about 2/3 of server IPs are missing entirely.
Here's an example of what the missing data looks like:
Running over smaller amounts of data causes the jobs to write all the data correctly. In particular, writing the data out one year per job works correctly.
Example that succeeded:
Example that dropped data:
One thing we're seeing in the jobs with issues is this scaling message:
Autoscaling: Unable to reach resize target in zone us-east1-c. QUOTA_EXCEEDED: Instance 'abc' creation failed: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 575.0 in region us-east1.
We're also seeing this error:
Autoscaling: Unable to reach resize target in zone us-east1-c. ZONE_RESOURCE_POOL_EXHAUSTED_WITH_DETAILS: Instance 'abc' creation failed: The zone 'projects/censoredplanet-analysisv1/zones/us-east1-c' does not have enough resources available to fulfill the request. '(resource type:compute)'.
The job is also not scaling to as many workers as it wants.
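For reference, the one-year-per-job workaround mentioned above amounts to splitting the 2018-2023 range into calendar years and launching a separate job for each. This is only a minimal sketch of that split: `yearly_ranges` is a hypothetical helper name, not a function from the actual pipeline, and how each date pair is handed to a job is left out.

```python
import datetime
from typing import Iterator, Tuple

DateRange = Tuple[datetime.date, datetime.date]


def yearly_ranges(start_year: int, end_year: int) -> Iterator[DateRange]:
    """Yield one inclusive (start_date, end_date) pair per calendar year.

    Hypothetical helper for illustration only.
    """
    for year in range(start_year, end_year + 1):
        yield datetime.date(year, 1, 1), datetime.date(year, 12, 31)


if __name__ == "__main__":
    for start, end in yearly_ranges(2018, 2023):
        # Each pair would be given to its own backfill job.
        print(f"{start.isoformat()} to {end.isoformat()}")
```

Running one job per range keeps each job small, which is the configuration that has been writing all rows correctly for us.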