googleapis / google-cloud-php

Google Cloud Client Library for PHP
https://cloud.google.com/php/docs/reference
Apache License 2.0
1.09k stars 436 forks source link

BigQueryClient runQuery frozen #3300

Closed dschott68 closed 3 years ago

dschott68 commented 4 years ago

Issue summary: runQuery freezes never returning results Affected timeframe: 2020-08-14 06:04:09 to 2020-08-14 06:10:03 Steps to reproduce: Execute a MERGE command using PHP API library BigQueryClient->runQuery. What is the business impact you are facing?:

This type of frozen runQuery command happens about 5-10 times per day out of several hundred MERGE statements per day.

Note we call with the following options set: 'timeoutMs' => 600000, 'maxRetries' => 6

Here is a job that successfully completed in 6 minutes, yet that runQuery call never returned so code is stuck.

select creation_time, project_id, project_number, statement_type, priority, start_time, end_time, query, state, reservation_id, total_bytes_processed, total_slot_ms, error_result, cache_hit, destination_table from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT where job_id = '790528e0-c203-4d11-93d0-bc43893eeb7b'
 [
  {
    "creation_time": "2020-08-14 06:04:09.368 UTC",
    "project_id": "etsy-schlep-bq-prod",
    "project_number": "298315228888",
    "statement_type": "MERGE",
    "priority": "INTERACTIVE",
    "start_time": "2020-08-14 06:04:10.113 UTC",
    "end_time": "2020-08-14 06:10:03.178 UTC",
    "query": "\n                MERGE `etsy-data-warehouse-prod.etsy_shard.listings` D\n                USING `etsy-schlep-bq-prod.s_etsy_shard.s_listings` S\n                ON D.`shop_id` = S.`shop_id` AND D.`listing_id` = S.`listing_id`\n                AND D.`create_date` BETWEEN 0 AND 1597384279\n                AND S.`create_date` BETWEEN 0 AND 1597384279\n                WHEN MATCHED AND S.`create_date` BETWEEN 0 AND 1597384279\n                AND D.`create_date` BETWEEN 0 AND 1597384279 THEN\n                UPDATE SET `listing_id` = S.`listing_id`, `shop_id` = S.`shop_id`, `user_id` = S.`user_id`, `state` = S.`state`, `seller_flags` = S.`seller_flags`, `admin_flags` = S.`admin_flags`, `is_available` = S.`is_available`, `is_displayable` = S.`is_displayable`, `is_searchable` = S.`is_searchable`, `title` = S.`title`, `description` = S.`description`, `price` = S.`price`, `currency_code` = S.`currency_code`, `quantity` = S.`quantity`, `section_id` = S.`section_id`, `featured_rank` = S.`featured_rank`, `custom_rank` = S.`custom_rank`, `favorites_count` = S.`favorites_count`, `non_taxable` = S.`non_taxable`, `image_key` = S.`image_key`, `state_date` = S.`state_date`, `create_date` = S.`create_date`, `ending_date` = S.`ending_date`, `original_create_date` = S.`original_create_date`, `update_date` = S.`update_date`, `listing_type` = S.`listing_type`, `shipping_profile_id` = S.`shipping_profile_id`, `rearrange_rank` = S.`rearrange_rank`\n                WHEN NOT MATCHED AND `create_date` BETWEEN 0 AND 1597384279 THEN\n                INSERT ROW\n            ",
    "state": "DONE",
    "reservation_id": "etsy-billing-prod:US.schlep",
    "total_bytes_processed": "538654334517",
    "total_slot_ms": "92141715",
    "error_result": null,
    "cache_hit": false,
    "destination_table": {
      "project_id": "etsy-data-warehouse-prod",
      "dataset_id": "etsy_shard",
      "table_id": "listings"
    }
  }
]
jdpedrie commented 4 years ago

Have you tried using BigQueryClient::startQuery() and polling the job status manually?

$job = $bigquery->startQuery();

// cache this in case you need to re-create the job object
// i.e. $bigquery->job($jobId);
$jobId = $job->id();

$results = $job->queryResults();

do {
    $results->reload();
    $complete = $results->isComplete();

    if (!$complete) {
        sleep(10);
    }
} while (!$complete);

// Use job results.

The main advantage here would be that you could get more insight into what the problem may be, and you could cache the Job ID so if you needed to restart the script you could gain a job reference.

dschott68 commented 4 years ago

We switched to startQuery with ExponentialBackoff per https://github.com/GoogleCloudPlatform/php-docs-samples/blob/master/bigquery/api/src/run_query_as_job.php a few days ago and it never freezes up anymore. We will use that long term. Just super interested into why runQuery logic was freezing up on a few random calls. The implementation looks straightforward and we had both timeoutMs and maxRetries set.

dwsupplee commented 3 years ago

I've opened https://github.com/googleapis/google-cloud-php/pull/3626 to add a note about preferring startQuery for more rigorous queries. I'm going to close this out, but if you're running into issues here as well we can re-open to look deeper. Providing some sample data for us to reproduce would be greatly appreciated.