samelamin / spark-bigquery

Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Apache License 2.0

Error when changing zone to something other than EU/US #69

Closed: ghost closed this issue 5 years ago

ghost commented 5 years ago

When I set the zone to europe-north1, I get the error listed below. Is this a bug, or are zones just not supported (yet)?

Exception in thread "main" com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Job project-name:project-name-ecc57416-1698-45b4-a1d3-cb7d2c88c356",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Job project-name:project-name-ecc57416-1698-45b4-a1d3-cb7d2c88c356",
  "status" : "NOT_FOUND"
}
    at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:164)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:64)
    at com.google.cloud.hadoop.io.bigquery.BigQueryUtils.waitForJobCompletion(BigQueryUtils.java:95)
    at com.samelamin.spark.bigquery.BigQueryClient.com$samelamin$spark$bigquery$BigQueryClient$$waitForJob(BigQueryClient.scala:153)
    at com.samelamin.spark.bigquery.BigQueryClient.load(BigQueryClient.scala:116)
    at com.samelamin.spark.bigquery.BigQueryDataFrame.saveAsBigQueryTable(BigQueryDataFrame.scala:43)

samelamin commented 5 years ago

Zones are just not supported. Feel free to send in a PR, though 👍

alphacr commented 5 years ago

@zwennesm I'm having the same problem. It seems that the library does not add the europe-north1 string to the job's name when it tries to locate the job.

I'm currently trying to load tables into BigQuery. The job was able to run, but I received the same error message because the job name the library looks up differs from the name of the job that is actually running in BigQuery.

Whenever I try to load a table from BigQuery, the same error appears, because the library is not able to create a job to load the staging table from BigQuery to GCS. I was able to mitigate this load problem by using sqlContext.bigQueryTable("project_id:dataset.table") instead.
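To illustrate the mismatch, here is a minimal sketch in plain Java. It is a toy model, not the BigQuery API or this library's code: the class JobLookupSketch, the in-memory JOBS map, and the helpers jobKey and findJob are all hypothetical, and the "project:location.jobId" key format is an assumption used only to show why a lookup without the location can miss a job that exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class JobLookupSketch {
    // Toy in-memory job store (hypothetical); stands in for the service side.
    static final Map<String, String> JOBS = new HashMap<>();

    // Hypothetical helper: builds the lookup key. When no location is given,
    // the key is just "project:jobId", as in the error message above.
    static String jobKey(String project, String location, String jobId) {
        return location == null
                ? project + ":" + jobId
                : project + ":" + location + "." + jobId;
    }

    // Hypothetical helper: looks a job up by project, optional location, and id.
    static Optional<String> findJob(String project, String location, String jobId) {
        return Optional.ofNullable(JOBS.get(jobKey(project, location, jobId)));
    }

    public static void main(String[] args) {
        // The job is registered under its region.
        JOBS.put(jobKey("project-name", "europe-north1", "job-1"), "DONE");

        // Lookup without the location misses the job (the "Not found" case).
        System.out.println(findJob("project-name", null, "job-1").isPresent()); // false

        // Lookup with the location finds it.
        System.out.println(findJob("project-name", "europe-north1", "job-1").isPresent()); // true
    }
}
```

If this model matches what is happening, the fix would be to pass the job's location along when polling for job completion, rather than only the project and job id.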

Martijn, if you were able to find a workaround for your problem, I hope you'll share it as well.

ghost commented 5 years ago

I started out with the same fix you mentioned, but I ended up moving to the Spotify library because of issues with regions and with running the application in Google Dataproc.