nextflow-io / nextflow-s3fs

An S3 File System Provider for Java 7 (project archived)
Apache License 2.0

Include SignerOverride for S3 Filesystem #9

Closed lukasjelonek closed 6 years ago

lukasjelonek commented 6 years ago

Hey,

I'd like to use the S3 support with a self-hosted S3 provider based on Ceph and radosgw. Unfortunately, authentication requires a different signer type than the default one, and I can't override the signer type in Nextflow.

I have successfully implemented it with the latest Amazon-S3-FileSystem-NIO2 library. See below:

package bio.comp.jlu.psos;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.metrics.RequestMetricCollector;
import com.amazonaws.services.s3.AmazonS3;
import com.upplication.s3fs.AmazonS3ClientFactory;

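/**
 * S3 client factory that forces the "S3SignerType" signer,
 * as required by the Ceph/radosgw endpoint described above.
 */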
public class CephAmazonS3Factory extends AmazonS3ClientFactory {

    @Override
    protected AmazonS3 createAmazonS3(AWSCredentialsProvider credentialsProvider, ClientConfiguration clientConfiguration, RequestMetricCollector requestMetricsCollector) {
        clientConfiguration.setSignerOverride("S3SignerType");
        return super.createAmazonS3(credentialsProvider, clientConfiguration, requestMetricsCollector);
    }

}

and I can use it with the following code:

Map<String, ?> env = ImmutableMap.<String, Object>builder()
  .put(com.upplication.s3fs.AmazonS3Factory.ACCESS_KEY, accessKey)
  .put(com.upplication.s3fs.S3FileSystemProvider.AMAZON_S3_FACTORY_CLASS, "bio.comp.jlu.psos.CephAmazonS3Factory")
  .put(com.upplication.s3fs.AmazonS3Factory.SECRET_KEY, secretKey).build();
FileSystem fs = FileSystems.newFileSystem(new URI("s3://"+host+"/"), env, Thread.currentThread().getContextClassLoader());
fs.getRootDirectories().forEach(System.out::println);
Path path = fs.getPath("/psos/Cloud-50.png");
System.out.println(Files.exists(path));

I checked whether I can set the SignerOverride in nextflow-s3fs, but unfortunately it uses a very old version of the aws-java-sdk that lacks an option to set this attribute. I checked out your code and updated the aws-java-sdk to the latest version; it compiled, but a few tests failed. I have seen that Amazon-S3-FileSystem-NIO2 is based on a newer aws-java-sdk, but your fork has diverged from it and can't be merged automatically, so implementing my request is non-trivial.

Is it intended to update the S3 filesystem to the latest upstream changes? Or would it be sufficient to update the aws-java-sdk to the latest version?

pditommaso commented 6 years ago

The tests in nextflow-s3fs are not reliable (we plan to rewrite the library from scratch). If the only problem is the version of the AWS SDK, it should be possible to update it. Feel free to submit a PR.

lukasjelonek commented 6 years ago

All right, I will try that on Monday.

lukasjelonek commented 6 years ago

Hey,

I checked the source code and I am not sure how to start. Nextflow references the artifact 'io.nextflow:nxf-s3fs:1.0.1', but in the master of this project I only see 'com.upplication:s3fs:0.2.8'. Where do I find the actual 'io.nextflow:nxf-s3fs:1.0.1' library code?

pditommaso commented 6 years ago

here it is

https://github.com/nextflow-io/nextflow-s3fs

lukasjelonek commented 6 years ago

At some point you must have changed the groupId from com.upplication to io.nextflow, but I can't find any branch that contains io.nextflow as the groupId in the pom.xml.

pditommaso commented 6 years ago

Oh, I see. I've never updated it. I'm using the gradle build for this project.

lukasjelonek commented 6 years ago

Can you check in the appropriate build.gradle file?

pditommaso commented 6 years ago

Oops. I was sure it was there. Just pushed.

lukasjelonek commented 6 years ago

I implemented and tested it and opened a pull request. I don't know whether it was intended or a bug, but the ClientOptions were not used when an access key and a secret key were provided. Finding this problem took most of the implementation time.
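
To illustrate the kind of problem described here, the following is a minimal sketch using hypothetical stand-in classes. It is not the actual nextflow-s3fs code: `FakeClient`, `createBuggy`, `createFixed`, and the `access_key` property name are assumptions made up for this example. It only shows how a credentials branch can silently drop client options such as the signer override.

```java
import java.util.Properties;

public class ClientOptionsSketch {

    /** Hypothetical stand-in for an S3 client; it only records how it was configured. */
    public static final class FakeClient {
        public final String accessKey;
        public final String signerOverride; // null means "use the default signer"

        public FakeClient(String accessKey, String signerOverride) {
            this.accessKey = accessKey;
            this.signerOverride = signerOverride;
        }
    }

    /** Buggy shape: when credentials are present, the client options are ignored. */
    public static FakeClient createBuggy(Properties props) {
        String accessKey = props.getProperty("access_key");   // hypothetical property name
        String signer = props.getProperty("signer_override");
        if (accessKey != null) {
            return new FakeClient(accessKey, null);            // options dropped on this branch
        }
        return new FakeClient(null, signer);
    }

    /** Fixed shape: the client options are applied on every branch. */
    public static FakeClient createFixed(Properties props) {
        String accessKey = props.getProperty("access_key");
        String signer = props.getProperty("signer_override");
        return new FakeClient(accessKey, signer);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("access_key", "EXAMPLE-KEY");
        props.setProperty("signer_override", "S3SignerType");

        System.out.println("buggy: " + createBuggy(props).signerOverride);
        System.out.println("fixed: " + createFixed(props).signerOverride);
    }
}
```

With both properties set, the buggy variant leaves the signer override unset while the fixed variant carries it through, which matches the symptom of a configured override having no effect.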

pditommaso commented 6 years ago

What is supposed to be a valid value for the signer_override property?

pditommaso commented 6 years ago

I've uploaded a new Nextflow snapshot that includes your PR. You may want to give it a try using

NXF_VER=0.26.3-SNAPSHOT nextflow run .. etc

That property should be defined in the config file as

aws.client.signerOverride = 'something'

lukasjelonek commented 6 years ago

aws.client.signerOverride = 'S3SignerType'

is used for S3 implementations that require AWS Signature Version 2 (aws2) signing instead of Signature Version 4 (aws4) signing. The snapshot works with our S3 service 👍
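
As a quick reference for the values discussed here, the sketch below maps signer override names to the signature scheme they select. The two names are, to the best of my knowledge, the S3 signer identifiers used by the AWS SDK for Java 1.x; the `SignerNames` class and `signatureVersion` helper are hypothetical and exist only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SignerNames {

    /** Signer override names (assumed from AWS SDK for Java 1.x) and the scheme each selects. */
    public static final Map<String, String> SIGNERS = new LinkedHashMap<>();
    static {
        SIGNERS.put("S3SignerType", "Signature Version 2");
        SIGNERS.put("AWSS3V4SignerType", "Signature Version 4");
    }

    /** Hypothetical helper: describes what a given signerOverride value would select. */
    public static String signatureVersion(String signerOverride) {
        if (signerOverride == null) {
            return "SDK default";
        }
        String version = SIGNERS.get(signerOverride);
        return version != null ? version : "unknown signer";
    }

    public static void main(String[] args) {
        // The value used for the Ceph/radosgw service in this thread
        System.out.println(signatureVersion("S3SignerType"));
        System.out.println(signatureVersion("AWSS3V4SignerType"));
        System.out.println(signatureVersion(null));
    }
}
```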

pditommaso commented 6 years ago

Great!

lukasjelonek commented 6 years ago

Thanks for including this patch

pditommaso commented 6 years ago

You are welcome. I will upload a new minor release soon.