Support custom S3-compatible endpoint such as min.io

bnol opened this issue 2 weeks ago
Hey @bnol — thanks for submitting this issue!
I've never used min.io, but because it is S3-compatible, it might just work with BemiDB out of the box.
I'm not sure how authentication and authorization work with MinIO. If it is compatible with the AWS S3 SDKs, would you be able to try something like this:
# us-east-1 is hardcoded in MinIO: https://github.com/minio/minio/discussions/15063
bemidb \
  --storage-type AWS_S3 \
  --iceberg-path iceberg \
  --aws-region us-east-1 \
  --aws-s3-bucket [YOUR_BUCKET] \
  --aws-access-key-id [MINIO_ROOT_USER] \
  --aws-secret-access-key [MINIO_ROOT_PASSWORD] \
  start
It's necessary to configure the endpoint as well, not just the region, because otherwise the SDK points at the default AWS endpoints. It would be something like this:
// Imports used below (aws-sdk-go-v2)
import (
    "context"

    "github.com/aws/aws-sdk-go-v2/aws"
    awsConfig "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/credentials"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// Add these fields to your Config struct
type Config struct {
    Aws struct {
        AccessKeyId     string
        SecretAccessKey string
        Region          string
        S3Bucket        string
        Endpoint        string // New field for custom endpoint
        ForcePathStyle  bool   // New field for path-style addressing
    }
    // ... other existing fields
}

func NewS3Storage(config *Config) *StorageS3 {
    awsCredentials := credentials.NewStaticCredentialsProvider(
        config.Aws.AccessKeyId,
        config.Aws.SecretAccessKey,
        "",
    )

    var logMode aws.ClientLogMode
    // if config.LogLevel == LOG_LEVEL_DEBUG {
    //     logMode = aws.LogRequest | aws.LogResponse
    // }

    // Create a custom endpoint resolver if an endpoint is specified
    var endpointResolver aws.EndpointResolverWithOptions
    if config.Aws.Endpoint != "" {
        endpointResolver = aws.EndpointResolverWithOptionsFunc(
            func(service, region string, options ...interface{}) (aws.Endpoint, error) {
                return aws.Endpoint{
                    URL:               config.Aws.Endpoint,
                    SigningRegion:     config.Aws.Region,
                    HostnameImmutable: true,
                }, nil
            })
    }

    // Load AWS config with custom options
    configOptions := []func(*awsConfig.LoadOptions) error{
        awsConfig.WithRegion(config.Aws.Region),
        awsConfig.WithCredentialsProvider(awsCredentials),
        awsConfig.WithClientLogMode(logMode),
    }

    // Add the custom endpoint resolver if specified
    if endpointResolver != nil {
        configOptions = append(configOptions,
            awsConfig.WithEndpointResolverWithOptions(endpointResolver))
    }

    loadedAwsConfig, err := awsConfig.LoadDefaultConfig(
        context.Background(),
        configOptions...,
    )
    PanicIfError(err)

    // In aws-sdk-go-v2, path-style addressing is an S3 client option
    // (there is no awsConfig.WithS3ForcePathStyle), so set it on the client itself
    s3ClientOptions := func(options *s3.Options) {
        if config.Aws.ForcePathStyle {
            options.UsePathStyle = true
        }
    }

    return &StorageS3{
        s3Client:    s3.NewFromConfig(loadedAwsConfig, s3ClientOptions),
        config:      config,
        storageBase: &StorageBase{config: config},
    }
}
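For illustration only, wiring a MinIO-style setup into that function could look like this (all values are hypothetical):

func main() {
    config := &Config{}
    config.Aws.AccessKeyId = "minio-root-user"
    config.Aws.SecretAccessKey = "minio-root-password"
    config.Aws.Region = "us-east-1" // can be any valid region for MinIO
    config.Aws.S3Bucket = "your-bucket"
    config.Aws.Endpoint = "http://minio-server:9000" // custom endpoint instead of AWS
    config.Aws.ForcePathStyle = true                 // MinIO typically needs path-style addressing

    storage := NewS3Storage(config)
    _ = storage
}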
Then, with a few extra options, it would be possible to use the SDK with any service that exposes the AWS S3 API. The options would be:
For MinIO:
aws:
  endpoint: "http://minio-server:9000"
  region: "us-east-1" # Can be any valid region
  s3_bucket: "your-bucket"
  access_key_id: "your-access-key"
  secret_access_key: "your-secret-key"
  force_path_style: true # MinIO typically requires path-style addressing
For Backblaze B2 / Digital Ocean / Wasabi / etc.:
aws:
  endpoint: "https://s3.us-west-001.backblazeb2.com" # Use your B2 region
  region: "us-west-001" # B2 region
  s3_bucket: "your-bucket"
  access_key_id: "your-keyID"
  secret_access_key: "your-applicationKey"
  force_path_style: false # B2 uses virtual-hosted-style addressing
@renatocron perhaps you should submit a PR? :D
It's necessary to configure the endpoint
Ah, you're right, good point! Unfortunately, we won't be able to prioritize it in the coming days because we'll be building some of the things mentioned in our Future Roadmap first (e.g., native support for complex data structures like JSON and arrays).
But if anyone would like to submit a PR, we'll be very happy to help get it merged. At a high level, here are the main things that would need to be updated:
- Accept a new --aws-s3-endpoint option similarly to other arguments (see config.go)
- Pass it as BaseEndpoint when creating an S3 client (see NewFromConfig in the code)
- Pass the ENDPOINT into the DuckDB secret to be able to read Iceberg tables (see aws_s3_secret in the code; a rough sketch of this and the previous step follows this list)
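Not a definitive implementation, but a rough, untested sketch of the last two steps, assuming aws-sdk-go-v2's s3.Options (BaseEndpoint, UsePathStyle) and DuckDB's CREATE SECRET syntax; the function names and the secret parameters below are illustrative and would need to be adapted to the actual code:

import (
    "fmt"
    "strings"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// Point the S3 client at the custom endpoint (loadedAwsConfig comes from
// awsConfig.LoadDefaultConfig, as in the snippet earlier in this thread).
func newS3Client(loadedAwsConfig aws.Config, endpoint string) *s3.Client {
    return s3.NewFromConfig(loadedAwsConfig, func(options *s3.Options) {
        if endpoint != "" {
            options.BaseEndpoint = aws.String(endpoint) // e.g. "http://minio-server:9000"
            options.UsePathStyle = true                 // MinIO generally requires path-style URLs
        }
    })
}

// Include ENDPOINT (plus URL_STYLE / USE_SSL) in the DuckDB S3 secret so DuckDB
// can read the Iceberg files from the same S3-compatible service.
func awsS3SecretSql(accessKeyId, secretAccessKey, region, endpoint string) string {
    sql := fmt.Sprintf(
        "CREATE OR REPLACE SECRET aws_s3_secret (TYPE S3, KEY_ID '%s', SECRET '%s', REGION '%s'",
        accessKeyId, secretAccessKey, region,
    )
    if endpoint != "" {
        // DuckDB expects the endpoint without a scheme, e.g. "minio-server:9000"
        host := strings.TrimPrefix(strings.TrimPrefix(endpoint, "https://"), "http://")
        useSsl := strings.HasPrefix(endpoint, "https://")
        sql += fmt.Sprintf(", ENDPOINT '%s', URL_STYLE 'path', USE_SSL %t", host, useSsl)
    }
    return sql + ")"
}

The exact DuckDB secret parameters (URL_STYLE, USE_SSL) may vary by service and DuckDB version, so that part in particular would need to be verified.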