GoogleCloudPlatform / google-cloud-eclipse

Google Cloud Platform plugin for Eclipse
Apache License 2.0

Dataflow bucket creation #3216

Open elharo opened 6 years ago

elharo commented 6 years ago

Problem using a newly created bucket while running the test plan. Investigating...

Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
    at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create(Pipeline.java:150)
    at com.example.StarterPipeline.main(StarterPipeline.java:50)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
    ... 4 more
Caused by: java.lang.IllegalArgumentException: Missing object or bucket in path: 'gs://bar45679/', did you mean: 'gs://some-bucket/bar45679'?
    at org.apache.beam.repackaged.beam_sdks_java_extensions_google_cloud_platform_core.com.google.common.base.Preconditions.checkArgument(Preconditions.java:383)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPath(GcsPathValidator.java:77)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported(GcsPathValidator.java:60)
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:245)
    ... 9 more

elharo commented 6 years ago

Wondering whether something changed recently in bucket validation:

  @Override
  public String verifyPath(String path) {
    GcsPath gcsPath = getGcsPath(path);
    checkArgument(gcsPath.isAbsolute(), "Must provide absolute paths for Dataflow");
    checkArgument(!gcsPath.getObject().isEmpty(),
        "Missing object or bucket in path: '%s', did you mean: 'gs://some-bucket/%s'?",
        gcsPath, gcsPath.getBucket());
    checkArgument(!gcsPath.getObject().contains("//"),
        "Dataflow Service does not allow objects with consecutive slashes");
    return gcsPath.toResourceName();
  }

elharo commented 6 years ago

Nope, that hasn't changed in a while. Maybe GcsPath?

chanseokoh commented 6 years ago

You should always specify a subfolder: gs://bucket/some-folder/. There was an issue in our repo about this that was closed as WAI (working as intended).

elharo commented 6 years ago

OK, the UI here is confusing then. We should disable the run button and probably put up an error decorator or message until the user has typed in a full path.
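
As a rough sketch of the check the UI could run before enabling the run button, something like the following might work. It only mirrors the two conditions visible in the verifyPath snippet quoted above (non-empty object, no consecutive slashes). The class name GcsLocationCheck and the message wording are hypothetical, and this does not use the plugin's existing code or Beam's GcsPath, which may parse edge cases differently.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the plugin or of Beam. */
public final class GcsLocationCheck {
  private static final String SCHEME = "gs://";

  private GcsLocationCheck() {}

  /**
   * Returns an error message suitable for a field decorator, or an empty
   * Optional if the location has both a bucket and an object/folder part.
   */
  public static Optional<String> validate(String location) {
    if (location == null || !location.startsWith(SCHEME)) {
      return Optional.of("Location must start with gs://");
    }
    String rest = location.substring(SCHEME.length());
    int slash = rest.indexOf('/');
    // Everything before the first slash is the bucket; the remainder is the object.
    String bucket = slash < 0 ? rest : rest.substring(0, slash);
    String object = slash < 0 ? "" : rest.substring(slash + 1);
    if (bucket.isEmpty()) {
      return Optional.of("Missing bucket name");
    }
    if (object.isEmpty()) {
      // Same condition that triggers "Missing object or bucket in path" at launch.
      return Optional.of(
          "Missing folder after bucket, for example gs://" + bucket + "/some-folder");
    }
    if (object.contains("//")) {
      return Optional.of("Object name must not contain consecutive slashes");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    String[] samples = {"gs://bar45679/", "gs://bar45679", "gs://bar45679/staging"};
    for (String sample : samples) {
      System.out.println(sample + " -> " + validate(sample).orElse("OK"));
    }
  }
}
```

The run button would stay disabled while validate returns a message, and that message could be shown in the error decorator.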