GoogleCloudPlatform / data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Apache License 2.0

08_dataflow/create_datasets.sh failed with java.lang.reflect.InvocationTargetException #42

Closed myoshimu closed 4 years ago

myoshimu commented 5 years ago

Running create_datasets.sh in Cloud Shell fails with the following error. Is this related to my environment, or is there a way to resolve it?

$ ./create_datasets.sh <BUCKET> 3
CommandException: 1 files/objects could not be removed.
[INFO] Scanning for projects...
[INFO]
[INFO] -------------< com.google.cloud.training.flights:chapter8 >-------------
[INFO] Building chapter8 [1.0.0,2.0.0]
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ chapter8 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/user/data-science-on-gcp/08_dataflow/chapter8/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.5.1:compile (default-compile) @ chapter8 ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- exec-maven-plugin:1.4.0:java (default-cli) @ chapter8 ---
[WARNING]
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:233)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:150)
    at com.google.cloud.training.flights.CreateTrainingDataset.main (CreateTrainingDataset.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:222)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:150)
    at com.google.cloud.training.flights.CreateTrainingDataset.main (CreateTrainingDataset.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
    at java.lang.Thread.run (Thread.java:748)
Caused by: java.lang.NoSuchMethodError: com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder.setBatchPath(Ljava/lang/String;)Lcom/google/api/client/googleapis/services/AbstractGoogleClient$Builder;
    at com.google.api.services.storage.Storage$Builder.setBatchPath (Storage.java:9307)
    at com.google.api.services.storage.Storage$Builder.<init> (Storage.java:9286)
    at org.apache.beam.sdk.util.Transport.newStorageClient (Transport.java:95)
    at org.apache.beam.sdk.util.GcsUtil$GcsUtilFactory.create (GcsUtil.java:96)
    at org.apache.beam.sdk.util.GcsUtil$GcsUtilFactory.create (GcsUtil.java:84)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper (ProxyInvocationHandler.java:592)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault (ProxyInvocationHandler.java:533)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke (ProxyInvocationHandler.java:155)
    at com.sun.proxy.$Proxy47.getGcsUtil (Unknown Source)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPathIsAccessible (GcsPathValidator.java:88)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported (GcsPathValidator.java:61)
    at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create (GcpOptions.java:245)
    at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create (GcpOptions.java:228)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper (ProxyInvocationHandler.java:592)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault (ProxyInvocationHandler.java:533)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke (ProxyInvocationHandler.java:155)
    at com.sun.proxy.$Proxy38.getGcpTempLocation (Unknown Source)
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions (DataflowRunner.java:240)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:222)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:150)
    at com.google.cloud.training.flights.CreateTrainingDataset.main (CreateTrainingDataset.java:95)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:293)
    at java.lang.Thread.run (Thread.java:748)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  11.681 s
[INFO] Finished at: 2019-02-05T16:13:20+09:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (default-cli) on project chapter8: An exception occured while executing the Java class. null: InvocationTargetException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions): com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder.setBatchPath(Ljava/lang/String;)Lcom/google/api/client/googleapis/services/AbstractGoogleClient$Builder; -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

myoshimu commented 5 years ago

$ sudo gcloud components update

All components are up to date.

$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

pagestarr66 commented 5 years ago

I got this working by changing the version of the Cloud Dataflow SDK to 2.4.0. Note, however, that the Dataflow SDK is deprecated by the end of 2019. See below for lines 74-78 from my pom.xml:

<dependency>
  <groupId>com.google.cloud.dataflow</groupId>
  <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
  <version>2.4.0</version>
</dependency>

lakshmanok commented 5 years ago

Could you try with the pom.xml from here? https://github.com/GoogleCloudPlatform/training-data-analyst/blob/ml-dataflow-quest/quests/dataflow/4_Streaming_Analytics/solution/pom.xml

Specifically:

<beam.version>2.12.0</beam.version>
...
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
      <version>${beam.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
      <version>${beam.version}</version>
      <exclusions>
        <exclusion>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
      <version>${beam.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
      <version>${beam.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-runners-direct-java</artifactId>
      <version>${beam.version}</version>
    </dependency>
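
The `NoSuchMethodError` on `setBatchPath` in the trace above means the `AbstractGoogleJsonClient$Builder` class loaded at runtime lacks a method that the `Storage` client was compiled against. That usually points to mismatched library versions on the classpath. A minimal sketch for checking this, before and after changing the pom.xml, might look like the following. The class name `ClasspathCheck` and its helper methods are hypothetical and not part of the repository. Only the class and method names passed in `main` come from the stack trace.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of data-science-on-gcp.
final class ClasspathCheck {

    // True if the class can be loaded and exposes a public method with this name
    // (declared or inherited). False if the class or the method is missing.
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Location (jar or directory) the class was loaded from.
    // JDK classes have no code source, so they report "unknown".
    static String loadedFrom(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "unknown";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the NoSuchMethodError in the trace.
        String cls = "com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder";
        System.out.println("setBatchPath present: " + hasPublicMethod(cls, "setBatchPath"));
        System.out.println("loaded from: " + loadedFrom(cls));
    }
}
```

To be meaningful, this has to run with the same classpath as the pipeline, for example by placing it in the chapter8 module and running it through exec-maven-plugin the way create_datasets.sh runs `CreateTrainingDataset`. If `setBatchPath` is reported missing, the printed jar location shows which artifact supplies the incompatible class.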