rabix / bunny

[Legacy] Executor for CWL workflows. Executes sbg:draft-2 and CWL 1.0
http://rabix.io
Apache License 2.0

TES backend fails citing s3 filesystem error #447

Closed · kmavrommatis closed this 5 years ago

kmavrommatis commented 5 years ago

Hi, I am trying to execute the example found in the rabix-1.0.5 release.

Running it with the LOCAL backend works as expected:

./rabix examples/dna2protein/dna2protein.cwl.json examples/dna2protein/inputs.json --tes-url http://localhost:8000

When using a locally installed Funnel, with backend.embedded.types=TES specified in core.properties, I get the following error from rabix:

./rabix --tes-url http://localhost:8000  examples/dna2protein/dna2protein.cwl.json examples/dna2protein/inputs.json --tes-url http://localhost:8000
[2019-01-14 15:33:25.067] [ERROR] Encountered an error while starting local backend.
org.rabix.engine.service.BootstrapServiceException: com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error injecting constructor, java.nio.file.FileSystemNotFoundException: S3 filesystem not yet created. Use newFileSystem() instead
  at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.<init>(LocalTESStorageServiceImpl.java:36)
  while locating org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl
  at org.rabix.backend.tes.TESModule.configure(TESModule.java:30) (via modules: org.rabix.cli.BackendCommandLine$1 -> org.rabix.backend.tes.TESModule)
  while locating org.rabix.backend.tes.service.TESStorageService
    for field at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl.storage(LocalTESWorkerServiceImpl.java:81)
  while locating org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl

1 error
    at org.rabix.engine.service.impl.BootstrapServiceImpl.start(BootstrapServiceImpl.java:47) ~[rabix-cli.jar:na]
    at org.rabix.cli.BackendCommandLine.main(BackendCommandLine.java:398) ~[rabix-cli.jar:na]
Caused by: com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error injecting constructor, java.nio.file.FileSystemNotFoundException: S3 filesystem not yet created. Use newFileSystem() instead
  at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.<init>(LocalTESStorageServiceImpl.java:36)
  while locating org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl
  at org.rabix.backend.tes.TESModule.configure(TESModule.java:30) (via modules: org.rabix.cli.BackendCommandLine$1 -> org.rabix.backend.tes.TESModule)
  while locating org.rabix.backend.tes.service.TESStorageService
    for field at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl.storage(LocalTESWorkerServiceImpl.java:81)
  while locating org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl

1 error
    at com.google.inject.internal.Errors.throwProvisionExceptionIfErrorsExist(Errors.java:486) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:67) ~[rabix-cli.jar:na]
    at com.google.inject.internal.InjectorImpl.injectMembers(InjectorImpl.java:987) ~[rabix-cli.jar:na]
    at org.rabix.engine.service.impl.BackendServiceImpl.scanEmbedded(BackendServiceImpl.java:82) ~[rabix-cli.jar:na]
    at org.rabix.engine.service.impl.BootstrapServiceImpl.start(BootstrapServiceImpl.java:45) ~[rabix-cli.jar:na]
    ... 1 common frames omitted
Caused by: java.nio.file.FileSystemNotFoundException: S3 filesystem not yet created. Use newFileSystem() instead
    at com.upplication.s3fs.S3FileSystemProvider.getFileSystem(S3FileSystemProvider.java:280) ~[rabix-cli.jar:na]
    at com.upplication.s3fs.S3FileSystemProvider.getPath(S3FileSystemProvider.java:298) ~[rabix-cli.jar:na]
    at java.nio.file.Paths.get(Paths.java:143) ~[na:1.8.0_172]
    at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.<init>(LocalTESStorageServiceImpl.java:47) ~[rabix-cli.jar:na]
    at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl$$FastClassByGuice$$148bf36a.newInstance(<generated>) ~[rabix-cli.jar:na]
    at com.google.inject.internal.DefaultConstructionProxyFactory$FastClassProxy.newInstance(DefaultConstructionProxyFactory.java:89) ~[rabix-cli.jar:na]
    at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:111) ~[rabix-cli.jar:na]
    at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:90) ~[rabix-cli.jar:na]
    at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:268) ~[rabix-cli.jar:na]
    at com.google.inject.internal.FactoryProxy.get(FactoryProxy.java:56) ~[rabix-cli.jar:na]
    at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46) ~[rabix-cli.jar:na]
    at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092) ~[rabix-cli.jar:na]
    at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40) ~[rabix-cli.jar:na]
    at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194) ~[rabix-cli.jar:na]
    at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41) ~[rabix-cli.jar:na]
    at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:54) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:132) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:93) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:80) ~[rabix-cli.jar:na]
    at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl.injectAndNotify(MembersInjectorImpl.java:80) ~[rabix-cli.jar:na]
    at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:62) ~[rabix-cli.jar:na]
    ... 4 common frames omitted

Rabix version: 1.0.5
Funnel version: 0.8.0

Funnel has been tested with cwl-tes and works as expected. Thanks in advance for your help, K

kmavrommatis commented 5 years ago

UPDATE: The previous version of rabix (1.0.4) seems to work.

adamstruck commented 5 years ago

What is the value of tes.storage_base in your config?

kmavrommatis commented 5 years ago

Hi, thanks for the quick response.

tes.storage_base=s3://s3.us-east-1.com/bucket_name/funnel

(where bucket_name is an existing bucket)

I intend to use rabix + funnel with AWS Batch and S3, but regardless of that, in this particular case, why would pointing to a local TES URL, with no input or output on S3, try to access S3?

K

adamstruck commented 5 years ago

Local inputs are staged to tes.storage_base since the funnel worker may be on a remote machine. The above error is due to Bunny being unable to find a matching S3 provider config (matches are based on the endpoint in an S3 URL). Your storage base has s3.us-east-1.com as the S3 endpoint; it should be s3.us-east-1.amazonaws.com to match the endpoint in the example S3 config.
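
For reference, a minimal core.properties sketch with a matching endpoint (only keys that already appear in this thread; bucket_name and the key values are placeholders):

# S3 staging area for TES inputs; the endpoint here must match the one in the s3.amazon.* provider config
tes.storage_base=s3://s3.us-east-1.amazonaws.com/bucket_name/funnel
s3.amazon.access_key=YOUR_ACCESS_KEY
s3.amazon.secret_key=YOUR_SECRET_KEY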

Let me know if you have any issues getting everything set up on AWS Batch; I am happy to help if you need it!

adamstruck commented 5 years ago

Depending on your storage needs on AWS Batch you may be interested in:

https://github.com/adamstruck/ebsmount/tree/master/resources/funnel

This approach comes with its own caveats since there are limits to the number of disks that can be mounted per VM. You need to pick reasonable defaults for resource requests and allowed VM sizes.

kmavrommatis commented 5 years ago

Thanks, but I must be missing something. I modified the line in core.properties to use the relevant endpoint: tes.storage_base=s3://s3.us-east-1.amazonaws.com/bucket/funnel (I thought I had previously tried it, but apparently not). I also tried s3://s3.amazonaws.com/bucket/funnel as the endpoint, as well as https://s3.amazonaws.com and https://s3.us-east-1.amazonaws.com/. Based on #407 I also tried s3:///bucket/funnel and modified the inputs.json file to match the location on S3, but got the same error. Finally, I tried giving the bucket name as is, without any 'subdirectory' in it, again without luck. In all cases I run the same command for the dna2protein example:

./rabix examples/dna2protein/dna2protein.cwl.json examples/dna2protein/inputs.json --tes-url http://localhost:8000

and I get the error:

[2019-01-15 12:51:48.041] [ERROR] Failed to retrieve TESTask
java.util.concurrent.ExecutionException: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: B84F4EE4D95F2B00; S3 Extended Request ID: T0TJPeXHXbpS+wumAxTab/Co2N+RjTTJiBz04Cl2TMeeK+j1+P2RZDdaur0f0rEqhtVkyW6MKEU=), S3 Extended Request ID: T0TJPeXHXbpS+wumAxTab/Co2N+RjTTJiBz04Cl2TMeeK+j1+P2RZDdaur0f0rEqhtVkyW6MKEU=
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.8.0_172]
    at java.util.concurrent.FutureTask.get(FutureTask.java:192) [na:1.8.0_172]
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$1.run(LocalTESWorkerServiceImpl.java:173) ~[rabix-cli.jar:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_172]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_172]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_172]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_172]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_172]
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: B84F4EE4D95F2B00; S3 Extended Request ID: T0TJPeXHXbpS+wumAxTab/Co2N+RjTTJiBz04Cl2TMeeK+j1+P2RZDdaur0f0rEqhtVkyW6MKEU=)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[rabix-cli.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[rabix-cli.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4319) ~[rabix-cli.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4266) ~[rabix-cli.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1261) ~[rabix-cli.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1236) ~[rabix-cli.jar:na]
    at com.upplication.s3fs.util.S3Utils.getS3ObjectSummary(S3Utils.java:37) ~[rabix-cli.jar:na]
    at com.upplication.s3fs.S3FileSystemProvider.exists(S3FileSystemProvider.java:612) ~[rabix-cli.jar:na]
    at com.upplication.s3fs.S3FileSystemProvider.createDirectory(S3FileSystemProvider.java:366) ~[rabix-cli.jar:na]
    at java.nio.file.Files.createDirectory(Files.java:674) ~[na:1.8.0_172]
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) ~[na:1.8.0_172]
    at java.nio.file.Files.createDirectories(Files.java:727) ~[na:1.8.0_172]
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.stageFileRequirements(LocalTESWorkerServiceImpl.java:385) ~[rabix-cli.jar:na]
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.call(LocalTESWorkerServiceImpl.java:256) ~[rabix-cli.jar:na]
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.call(LocalTESWorkerServiceImpl.java:225) ~[rabix-cli.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_172]
    ... 3 common frames omitted
[2019-01-15 12:51:48.044] [INFO] com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: B84F4EE4D95F2B00; S3 Extended Request ID: T0TJPeXHXbpS+wumAxTab/Co2N+RjTTJiBz04Cl2TMeeK+j1+P2RZDdaur0f0rEqhtVkyW6MKEU=), S3 Extended Request ID: T0TJPeXHXbpS+wumAxTab/Co2N+RjTTJiBz04Cl2TMeeK+j1+P2RZDdaur0f0rEqhtVkyW6MKEU=

adamstruck commented 5 years ago

Did you set a valid access_key and secret_key?

I was able to reproduce the above when those values were left as the defaults in the config:

s3.amazon.access_key=***************************
s3.amazon.secret_key=****************************************

kmavrommatis commented 5 years ago

Hi, thanks for the response; unfortunately I'm still stuck.

I have tried it both ways: I set the access key and secret key in core.properties, and I also tried leaving those lines at the default values (or commenting them out entirely), since the AWS SDK can take the values from ~/.aws/credentials or from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, but the result was the same.

Thinking that this might be due to encryption (our buckets require SSE with AES256), I also experimented with a bucket with no encryption requirements whatsoever, but the output is still the same.
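
As a sanity check on the credential lookup itself, a minimal snippet along these lines (a sketch against the AWS SDK v1 classes visible in the stack traces, not part of rabix) prints whichever access key the SDK's default chain resolves:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;

public class CredentialsCheck {
    public static void main(String[] args) {
        // The default chain consults, in order: environment variables
        // (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY), Java system properties,
        // the ~/.aws/credentials profile file, and instance metadata.
        AWSCredentials creds = new DefaultAWSCredentialsProviderChain().getCredentials();
        System.out.println("Resolved access key: " + creds.getAWSAccessKeyId());
    }
}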

adamstruck commented 5 years ago

Try commenting out the s3.amazon.signer_override line in your config.

kmavrommatis commented 5 years ago

Thanks for the suggestion. That worked! Could you explain what this setting actually does? Thanks again for your help.

adamstruck commented 5 years ago

The signer type determines which format of authorization header is used. The default should have been AWSS3V4SignerType, not AWSV4SignerType. You can read a bit about it here.

This is exposed as an option in the config since some S3 providers don't support version 4 signatures.
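
For illustration, this is roughly how a signer override takes effect at the SDK level (a minimal AWS SDK v1 sketch, not Bunny's actual wiring; region and credentials come from the default chain):

import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class SignerOverrideDemo {
    public static void main(String[] args) {
        ClientConfiguration config = new ClientConfiguration();
        // Force the S3-specific V4 signer. Leaving the override unset lets the
        // SDK choose the correct signer for the service and region, which is
        // why removing the misconfigured s3.amazon.signer_override helped here.
        config.setSignerOverride("AWSS3V4SignerType");

        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withClientConfiguration(config)
                .withRegion("us-east-1")
                .build();
        System.out.println(s3.listBuckets());
    }
}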