apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

BigQuery DIRECT_READ does not validate pipeline's project ID and instead tries to read from a null project #21507

Open damccorm opened 2 years ago

damccorm commented 2 years ago

When a pipeline is created without a GCP project ID and tries to read from BigQuery using the Storage Read API (DIRECT_READ), it fails with the following unhelpful error:


org.apache.beam.sdk.Pipeline$PipelineExecutionException: com.google.api.gax.rpc.PermissionDeniedException:
io.grpc.StatusRuntimeException: PERMISSION_DENIED: BigQuery Storage API has not been used in project
770406736630 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/bigquerystorage.googleapis.com/overview?project=770406736630
then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our
systems and retry.

It looks like the project ID is never validated, so Beam attempts the read without one. Project 770406736630, the one named in the error, is a null project, which confuses users because it is not their project.

 

Doing the same with the EXPORT read method produces this more helpful error:


org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.NullPointerException: Required parameter
projectId must be specified.
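
For illustration only, here is a minimal sketch of the kind of fail-fast check that would make DIRECT_READ fail as clearly as EXPORT does. It is plain Java with no Beam dependency. The class `ProjectIdCheck`, the method `requireProjectId`, and the error message wording are all hypothetical and are not part of Beam's API; where such a check would actually belong in the BigQuery connector is not something this issue establishes.

```java
public final class ProjectIdCheck {
    private ProjectIdCheck() {}

    /**
     * Hypothetical helper: returns the project ID unchanged if it is usable,
     * otherwise throws with a message that names the read method, so the user
     * is not pointed at a project that is not theirs.
     */
    public static String requireProjectId(String projectId, String readMethod) {
        if (projectId == null || projectId.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "A GCP project ID must be specified to read from BigQuery using "
                    + readMethod
                    + ", but none was set on the pipeline.");
        }
        return projectId;
    }

    public static void main(String[] args) {
        // A configured project passes through untouched.
        System.out.println(requireProjectId("my-project", "DIRECT_READ"));

        // A missing project fails immediately with an actionable message.
        try {
            requireProjectId(null, "DIRECT_READ");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that the check runs before any call to the Storage Read API, so the failure mentions the missing project ID rather than an unrelated project number.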

 

Imported from Jira BEAM-14119. Original Jira may contain additional context. Reported by: ahmedabu.

Amar3tto commented 5 months ago

Looks like this issue is outdated and can be closed.