googleapis / google-cloud-php

Google Cloud Client Library for PHP
https://cloud.google.com/php/docs/reference
Apache License 2.0

Internal query iterator's batch size limit is undefined (Firestore Datastore) #7852

Open calsmith opened 10 hours ago

calsmith commented 10 hours ago

This library returns an iterator for queries. If no limit is specified, the iterator continues until no results remain, for example:

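use Google\Cloud\Datastore\DatastoreClient;

$datastore = new DatastoreClient();

// Build a query with no explicit limit.
$query = $datastore->query()
    ->kind('Task')
    ->filter('done', '=', false);

// The returned iterator fetches further pages internally as it is consumed.
$iterator = $datastore->runQuery($query);
foreach ($iterator as $entity) {
    // ...
}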

What is the default value of limit if none is set? In other words, what is the maximum size of the batches (pages) fetched by the internal iterator?

This is not documented on the limits page or in the Query reference.

Why does this matter? We need to know the max value of limit so we can maximize throughput when batch processing query results. There are some mentions of setBatchSize() in the Python library, but this is not documented anywhere either. Additionally, we often need to batchInsert() n entities using results from the iterator, and so need to know the max size of each page.
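To illustrate the batching use case, here is a minimal sketch of how results from an iterator could be grouped into fixed-size batches before a batch write. `chunkIterable` is a hypothetical helper, not part of the library. The `ArrayIterator` only stands in for the query iterator, and the batch size of 3 is an arbitrary value chosen for illustration.

```php
<?php

/**
 * Hypothetical helper (not part of google-cloud-php): groups any iterable
 * into arrays of at most $size items without loading everything into memory.
 */
function chunkIterable(iterable $items, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $batch = [];
    foreach ($items as $item) {
        $batch[] = $item;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }

    // Emit the final, possibly smaller, batch.
    if ($batch !== []) {
        yield $batch;
    }
}

// Stand-in for the query iterator; any iterable is handled the same way.
$results = new ArrayIterator(range(1, 7));

$batchSizes = [];
foreach (chunkIterable($results, 3) as $batch) {
    // In real code, each $batch would be handed to the batch write here.
    $batchSizes[] = count($batch);
}
```

With this approach the size of each write is chosen by us and does not depend on the iterator's internal page size, but knowing that page size would still let us align the two and avoid unnecessary round trips.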

Thank you