spotify / spark-bigquery

Google BigQuery support for Spark, SQL, and DataFrames
Apache License 2.0

saveAsBigQueryTable() method returns NoSuchMethodError #62

Open pranay29 opened 6 years ago

pranay29 commented 6 years ago

I am running spark-shell (Scala) on Spark version 2.2.1:

spark-shell --packages com.spotify:spark-bigquery_2.10:0.2.0

scala> sqlContext.setGcpJsonKeyFile(file_path)

scala> sqlContext.setBigQueryProjectId("proj")

scala> sqlContext.setBigQueryGcsBucket("dummy_bucket")

scala> sqlContext.setBigQueryDatasetLocation("US")

I am trying to load some data into BigQuery, which fails with the error shown below:

scala> val df = Seq((1,1,1), (2,2,2)).toDF("A","B","C")

scala> df.show
+---+---+---+
|  A|  B|  C|
+---+---+---+
|  1|  1|  1|
|  2|  2|  2|
+---+---+---+

scala> df.saveAsBigQueryTable("proj:dataset_name.table_name")
java.lang.NoSuchMethodError: com.google.common.base.Splitter.splitToList(Ljava/lang/CharSequence;)Ljava/util/List;
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase$ParentTimestampUpdateIncludePredicate.create(GoogleHadoopFileSystemBase.java:572)
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.createOptionsBuilderFromConfig(GoogleHadoopFileSystemBase.java:1890)
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1587)
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:793)
  at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:756)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
  at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:394)
  at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:471)
  at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:50)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:609)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
  at com.databricks.spark.avro.package$AvroDataFrameWriter$$anonfun$avro$1.apply(package.scala:26)
  at com.databricks.spark.avro.package$AvroDataFrameWriter$$anonfun$avro$1.apply(package.scala:26)
  at com.spotify.spark.bigquery.package$BigQueryDataFrame.saveAsBigQueryTable(package.scala:159)
  at com.spotify.spark.bigquery.package$BigQueryDataFrame.saveAsBigQueryTable(package.scala:171)
  ... 50 elided

Kindly help !!

chitralverma commented 6 years ago

Hi @pranay29, did you get around this?

pranay29 commented 6 years ago

Hi @chitralverma,

I still have not got around this. I don't know if it's fixed yet.

nevillelyh commented 6 years ago

It's the classic Guava version conflict. You can probably work around it by pinning Guava and/or other dependency versions.
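
To check this from the same spark-shell session before changing any dependencies, here is a minimal sketch that asks the JVM which jar supplied the class named in the stack trace. `jarOf` is a made-up helper for illustration, not part of spark-bigquery or Spark. `Splitter.splitToList` appears to have been added in Guava 15.0, so if the printed location points at an older Guava jar (for example one bundled with the Spark or Hadoop distribution), that would be consistent with the error above.

```scala
import scala.util.Try

// Hypothetical helper (not part of spark-bigquery): returns the jar or
// directory a class was loaded from, or None if the class cannot be loaded
// or the JVM reports no code source (as happens for core JDK classes).
def jarOf(className: String): Option[String] =
  for {
    cls <- Try(Class.forName(className)).toOption
    src <- Option(cls.getProtectionDomain.getCodeSource)
    loc <- Option(src.getLocation)
  } yield loc.toString

// Class name taken from the NoSuchMethodError in the stack trace above.
println(jarOf("com.google.common.base.Splitter"))
```

If this prints `None`, the class could not be loaded at all, or the JVM did not report where it came from.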

chitralverma commented 6 years ago

It seems to be coming from a Spark 2.2.1 dependency. Can you try it with 2.2.0?

pranay29 commented 6 years ago

I am not sure I can change the existing Spark version, since it's in a production environment.

sunilpashikanti commented 6 years ago

My Spark version is 2.2.1 and I ran into this error. Kindly help.

spark-shell --packages com.github.samelamin:spark-bigquery_2.11:0.2.4

import com.samelamin.spark.bigquery._

df.saveAsBigQueryTable("myproj:dataset.target1")
java.lang.NoSuchMethodError: com.google.cloud.hadoop.io.bigquery.BigQueryStrings.parseTableReference(Ljava/lang/String;)Lcom/google/api/services/bigquery/model/TableReference;
  at com.samelamin.spark.bigquery.BigQueryDataFrame.saveAsBigQueryTable(BigQueryDataFrame.scala:40)
  ... 50 elided

mikerlt commented 4 years ago

Has anyone found a workaround for this?

sunilpashikanti commented 4 years ago

I found one, about a year ago.

mikerlt commented 4 years ago

Ok, and what did you do to get it to work?

sunilpashikanti commented 4 years ago

You have to add the dependent jar file to the Spark library.

mikerlt commented 4 years ago

Do you know which jar needs to be added?

sunilpashikanti commented 4 years ago

I did it months ago. Let me check and let you know.

sunilpashikanti commented 4 years ago

Bigquery-connector-latest-hadoop2.jar
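
A quick way to check whether adding the jar took effect, assuming the failure was the `BigQueryStrings.parseTableReference` error quoted earlier in this thread, is to look the method up by reflection in the same spark-shell session. This is only a sketch: `hasMethod` is a hypothetical helper written for this example, and the class, method, parameter, and return type names are copied from that error message.

```scala
import scala.util.Try

// Hypothetical helper: true if `className` can be loaded and has a public
// method `method` taking exactly `params` and returning the type named
// `returns`. The return type matters because the JVM links against the
// full method descriptor, not just the name and parameters.
def hasMethod(className: String, method: String, returns: String, params: Class[_]*): Boolean = {
  val found = Try(Class.forName(className).getMethod(method, params: _*)).toOption
  found.exists(_.getReturnType.getName == returns)
}

// Names below are taken from the NoSuchMethodError reported in this thread.
println(hasMethod(
  "com.google.cloud.hadoop.io.bigquery.BigQueryStrings",
  "parseTableReference",
  "com.google.api.services.bigquery.model.TableReference",
  classOf[String]))
```

A result of `false` would suggest that the class is still missing or is still being resolved from a jar that lacks this method.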

mikerlt commented 4 years ago

Thank you!!
