googleapis / python-firestore

Apache License 2.0

504 Deadline Exceeded: BigQuery to Firestore Collection using Python in Batch Writes Mode #873

Open sanket-coditas opened 6 months ago

sanket-coditas commented 6 months ago

Hi,

I am working on the setup as mentioned below and getting the error: "504 Deadline Exceeded"

  1. Read approx. 10k rows from BigQuery in a Cloud Function using Python
  2. Write those 10k rows to a Firestore collection using the code below:
     - Batch size is set to 500
     - It forms smaller batches (approx. 20) and loads the data
from google.api_core.exceptions import GoogleAPIError

max_batch_size = 500  # Firestore allows at most 500 writes per batch
batch_number = 0
for i in range(0, len(records), max_batch_size):
    batch_number += 1
    batch = firestore_client.batch()
    for data in records[i:i + max_batch_size]:
        doc_ref = collection_ref.document()  # auto-generated document ID
        batch.set(doc_ref, data)
    try:
        batch.commit()
    except GoogleAPIError as error:
        print(f"An error occurred in batch {batch_number}: {error}")
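One way the except branch above could be made more robust is to retry the commit with exponential backoff, since DEADLINE_EXCEEDED is often transient. A minimal sketch, where the `commit_fn` callable stands in for `batch.commit` and the helper name and retry parameters are illustrative:

```python
import time


def commit_with_backoff(commit_fn, max_attempts=5, base_delay=1.0):
    """Call commit_fn, retrying with exponential backoff on failure.

    commit_fn stands in for batch.commit; in real code you would catch
    only retryable errors (e.g. DeadlineExceeded), not all exceptions.
    """
    for attempt in range(max_attempts):
        try:
            return commit_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # wait 1s, 2s, 4s, ... before the next attempt
            time.sleep(base_delay * (2 ** attempt))
```

With this helper, the loop body would call `commit_with_backoff(batch.commit)` instead of `batch.commit()`, so a single slow RPC no longer drops the whole batch.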

I am consistently facing the "504 Deadline Exceeded" error.

Things that I have already tried -

  1. Changing the batch size to 1000; it still fails. Changing it to 200 makes it very slow and it sometimes fails as well.
  2. Adding an index to the Firestore collection (not sure whether adding/removing an index would help here)

A few questions:

  1. Is there a better way to handle this error? I am unable to find any consistent behavior.
  2. As far as I know, the batch.commit() timeout is 60s. How can I change it to a larger value, and is doing so recommended?

Can someone please help?

Thank you in advance! Sanket Kelkar

daniel-sanche commented 2 months ago

You should be able to pass a different timeout value to batch.commit()
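A minimal sketch of that, assuming google-cloud-firestore's `WriteBatch.commit`, which accepts a `timeout` keyword argument in seconds; the helper name and the 120-second default are chosen for illustration:

```python
def write_in_batches(client, collection_ref, records,
                     batch_size=500, commit_timeout=120.0):
    """Write records in batches of up to batch_size documents,
    passing an explicit per-commit RPC timeout instead of the default."""
    for i in range(0, len(records), batch_size):
        batch = client.batch()
        for data in records[i:i + batch_size]:
            # auto-generated document ID, as in the original snippet
            batch.set(collection_ref.document(), data)
        # WriteBatch.commit accepts a `timeout` kwarg (seconds)
        batch.commit(timeout=commit_timeout)
```

Whether raising the timeout is advisable is a separate question: it stops the client from giving up at 60s, but if commits routinely take that long, smaller batches or spreading the writes over time may address the underlying latency better.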