Closed: BohuTANG closed this issue 2 years ago.
Found another issue with setting credentials and `S3_STORAGE_ENDPOINT_URL`:

```
export S3_STORAGE_ENDPOINT_URL=https://s3.amazonaws.com
```

```
displayText = Parse csv error at line 0, cause: unexpected: (op: read, path: /dataset/stateful/ontime.csv, source: Error { code: "PermanentRedirect", message: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.", request_id: "xx", s3_extended_request_id: "xx" }).
```
Update:

I have tested with the URL https://s3.amazonaws.com/repo.databend.rs/dataset/stateful/ontime.csv -- it returns the same error.

If we append the region, it works well: https://s3.us-east-2.amazonaws.com/repo.databend.rs/dataset/stateful/ontime.csv
```
export S3_STORAGE_ENDPOINT_URL=https://s3.amazonaws.com
```

It's buggy now.
The behavior between keeping `S3_STORAGE_ENDPOINT_URL` empty and setting `S3_STORAGE_ENDPOINT_URL=https://s3.amazonaws.com/` is different:

- `S3_STORAGE_ENDPOINT_URL=""`: the S3 SDK will use `format!("s3.{}.amazonaws.com", region)` to construct the correct endpoint.
- `S3_STORAGE_ENDPOINT_URL=https://s3.amazonaws.com/`: the S3 SDK will take this input as static and can't handle the region correctly.

I'm considering removing the `region` option so that users only provide an endpoint like `https://s3.amazonaws.com/` or `http://127.0.0.1:9000`, and we detect the region for them; see the sketch below.
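To make the difference concrete, here is a minimal sketch of the two code paths (the `resolve_endpoint` helper is hypothetical, not OpenDAL or SDK code):

```rust
// Hypothetical helper: how an endpoint could be resolved from config.
fn resolve_endpoint(endpoint: &str, region: &str) -> String {
    if endpoint.is_empty() {
        // Empty endpoint: construct the region-qualified host, as the SDK does.
        format!("https://s3.{}.amazonaws.com", region)
    } else {
        // User-supplied endpoint: taken as a static value, the region is ignored,
        // which is why https://s3.amazonaws.com ends in PermanentRedirect.
        endpoint.trim_end_matches('/').to_string()
    }
}

fn main() {
    assert_eq!(
        resolve_endpoint("", "us-east-2"),
        "https://s3.us-east-2.amazonaws.com"
    );
    assert_eq!(
        resolve_endpoint("https://s3.amazonaws.com/", "us-east-2"),
        "https://s3.amazonaws.com"
    );
}
```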
Thank you.
The `The bucket you are attempting to access must be addressed using the specified endpoint` error is already addressed in https://github.com/datafuselabs/opendal/issues/37. It's a bit complex, but it is possible.
The SDK's maintainer gives a workaround: https://github.com/awslabs/aws-sdk-rust/issues/425#issuecomment-1020265854

And I think we can inject the middleware to make it better:

```rust
let builder = aws_smithy_client::Builder::dyn_https()
    .middleware(AwsS3::middleware::DefaultMiddleware::new()) // <-- inject the middleware here.
    .build();

Ok(Arc::new(Backend {
    // Make `/` the default root.
    root,
    bucket: self.bucket.clone(),
    client: AwsS3::Client::with_config(builder, cfg.build()),
}))
```
Maybe we can configure `OperationSigningConfig.signing_requirements` to `SigningRequirements::Disabled`. But it may take some time to figure out how to make it work.
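For reference, here is a rough sketch of that idea. It is only a sketch: it assumes the `aws-sig-auth` crate's `OperationSigningConfig` / `SigningRequirements` types and that the generated operations put the signing config into the request's property bag.

```rust
use aws_sig_auth::signer::{OperationSigningConfig, SigningRequirements};

// A tower layer that marks every outgoing operation as "do not sign",
// which is what we want for anonymous access to public buckets.
let disable_signing = tower::util::MapRequestLayer::new(
    |req: aws_smithy_http::operation::Request| {
        let (inner, mut props) = req.into_parts();
        {
            let mut p = props.acquire_mut();
            // If the operation carries a signing config, flip it to Disabled.
            if let Some(signing_config) = p.get_mut::<OperationSigningConfig>() {
                signing_config.signing_requirements = SigningRequirements::Disabled;
            }
        }
        aws_smithy_http::operation::Request::from_parts(inner, props)
    },
);
```

Such a layer could then be wrapped in `DynMiddleware` and plugged into the client builder in the same way as the snippets above.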
Thanks. Maybe we can prioritize this issue https://github.com/datafuselabs/opendal/issues/12 first, so that the databend side can continue to list all files and read them for the COPY command.
Maybe we can use:

```rust
let client = aws_sdk_s3::Client::with_config(
    aws_sdk_s3::client::Builder::new()
        .connector(DynConnector::new(my_custom_connector)) // Now with `DynConnector`
        .middleware(DynMiddleware::new(my_custom_middleware)) // Now with `DynMiddleware`
        .default_async_sleep()
        .build(),
    config,
);
```
Read https://github.com/awslabs/aws-sdk-rust/releases/tag/v0.7.0 for more details.
Just to note, here is today's exploration; I can get the props from the request successfully:

```rust
use aws_smithy_client::{Builder, Client};

let hyper_connector =
    aws_smithy_client::hyper_ext::Adapter::builder().build(aws_smithy_client::conns::https());

// let connector = DynConnector::new()
let x = aws_smithy_client::Builder::new()
    .connector(hyper_connector)
    .middleware(aws_smithy_client::erase::DynMiddleware::new(
        tower::util::MapRequestLayer::new(|req: aws_smithy_http::operation::Request| {
            let (inner, mut props) = req.into_parts();
            {
                let mut p = props.acquire_mut();
                let x = p.get_mut::<usize>();
                println!("{:?}", x);
            }
            aws_smithy_http::operation::Request::from_parts(inner, props)
        }),
    ))
    .build();

let _ = aws_sdk_s3::Client::with_config(x.into_dyn(), cfg.build());
```
It works now!
Background

The ACL of `s3://repo.databend.rs/dataset/stateful/ontime.csv` is Everyone (public access) on AWS S3. If credentials are not set for OpenDAL, we will get an error.
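Since the ACL grants public read, a plain unsigned HTTP GET to the region-qualified URL should succeed. A quick check (a sketch, assuming the `reqwest` crate with its `blocking` feature enabled):

```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // No credentials and no SigV4 signing: just a plain GET to the public object.
    let url = "https://s3.us-east-2.amazonaws.com/repo.databend.rs/dataset/stateful/ontime.csv";
    let resp = reqwest::blocking::get(url)?;
    // Expect 200 OK because the object is publicly readable.
    println!("status: {}", resp.status());
    Ok(())
}
```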
Enhancement

It would be better to add a test for a minio bucket with a public ACL.