apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

Improve handling of S3 offloading configuration in other regions #9084

Open addisonj opened 3 years ago

addisonj commented 3 years ago

Is your enhancement request related to a problem? Please describe.

Currently, when using S3 offloading in a region other than us-east-1, the underlying jclouds library does some non-standard AWS things, which requires both:

A) the policy Pulsar is running as to have GetBucketLocation permission, and
B) the endpoint to be changed to a region-specific one

This behavior is confusing, poorly documented, difficult to explain, and differs from most AWS client implementations. See https://github.com/apache/pulsar/issues/3833 for context.
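As a concrete illustration of requirement A, the IAM policy attached to the brokers currently needs a statement along these lines (the bucket name is illustrative, not from this issue):

```json
{
  "Effect": "Allow",
  "Action": "s3:GetBucketLocation",
  "Resource": "arn:aws:s3:::my-offload-bucket"
}
```

And for requirement B, s3ManagedLedgerOffloadServiceEndpoint has to be pointed at the bucket's regional endpoint instead of the default.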

Describe the solution you'd like

We should do two things:

  1. See if we can eliminate the need for GetBucketLocation. Looking at https://github.com/apache/jclouds/blob/31a3e5b5df1543d04098e3a694130b7ae8e6e079/apis/s3/src/main/java/org/jclouds/s3/config/S3HttpApiModule.java#L91, it appears to be used only when jclouds detects multiple regions. Where jclouds gets more than one region from isn't clear, but if the user sets a region, we should just use that single region and skip the GetBucketLocation check.
  2. Ensure that setting just the region is sufficient to configure the correct endpoint. Eliminating the GetBucketLocation check may be enough for the default endpoint to work; otherwise, we should build the correct endpoint when a region is specified but no endpoint is manually provided.
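The second step above could be sketched as a small helper. This is hypothetical, not existing Pulsar code, and it assumes the standard regional S3 endpoint scheme:

```java
// Hypothetical helper illustrating step 2: derive a region-specific S3
// endpoint when the user configures a region but no explicit endpoint.
public class S3EndpointResolver {

    static String endpointForRegion(String region) {
        // us-east-1 (or no region at all) keeps the legacy global endpoint.
        if (region == null || region.isEmpty() || "us-east-1".equals(region)) {
            return "https://s3.amazonaws.com";
        }
        // Standard regional endpoint scheme, e.g. https://s3.us-west-2.amazonaws.com
        return "https://s3." + region + ".amazonaws.com";
    }

    public static void main(String[] args) {
        System.out.println(endpointForRegion("us-west-2"));
        // -> https://s3.us-west-2.amazonaws.com
    }
}
```

With something like this in place, a user who sets only s3ManagedLedgerOffloadRegion would get a working endpoint without having to know AWS's endpoint naming scheme.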

Describe alternatives you've considered

Another consideration (and perhaps still a longer-term goal) is to replace jclouds for AWS (while still using it for other cloud providers), since jclouds has other behavior that differs from AWS.

Additional context

codelipenghui commented 3 years ago

@Renkai Could you please take a look at this issue?

Renkai commented 3 years ago

If my understanding is correct, the current jclouds API we use doesn't even let us set the region manually; it has to be derived from the endpoint: https://github.com/apache/pulsar/blob/98ad39ffa51239e389c73411dfb8df7f5592a5aa/tiered-storage/jcloud/src/main/java/org/apache/bookkeeper/mledger/offload/jcloud/provider/JCloudBlobStoreProvider.java#L283

Maybe we should help jclouds produce the right error info, or we can use the official AWS SDK to generate a jclouds-compatible endpoint in advance.

Renkai commented 3 years ago

It might be a long journey to refactor our blob client to solve this issue.

Anonymitaet commented 3 years ago

Confirmed with @Renkai: it might take some time to solve this issue, but for now I can add the workaround to the docs to make it clear to users.

Renkai commented 3 years ago

A workaround patch for the docs is available here: https://github.com/apache/pulsar/pull/9366

zymap commented 3 years ago

Move this to the next release.

congbobo184 commented 3 years ago

Move this to the next release.

michaeljmarshall commented 2 years ago

Removing the release label. We need a PR to have a release.

dave2wave commented 2 years ago

If you are not in us-east-1, you can work around this issue in broker.conf.

I'm using Terraform and Ansible. Here's the broker.conf template:

s3ManagedLedgerOffloadBucket={{ s3_bucket }}
s3ManagedLedgerOffloadRegion={{ s3_region }}
s3ManagedLedgerOffloadServiceEndpoint={{ s3_url }}

Here is how these facts are set in the Ansible playbook:

        s3_bucket: "{{ tf_s3_bucket }}"
        s3_region: "{{ tf_s3_region }}"
        s3_url: "https://{{ tf_s3_bucket }}.s3.{{ tf_s3_region }}.amazonaws.com"

I read the bucket and region from a file created by Terraform:

    - name: Get variables from terraform
      include_vars: ./tf_ansible_vars.yaml

In Terraform, the bucket and region come from terraform.tfvars and are written out like so:

region          = "us-west-2"
ami             = "ami-9fa343e7" // RHEL-7.4
s3_bucket       = "omb-testing-1"

# Export Terraform variable values to an Ansible var_file
resource "local_file" "tf_ansible_vars" {
  content = <<-DOC
    tf_s3_bucket: ${var.s3_bucket}
    tf_s3_region: ${var.region}
    DOC
  filename = "./tf_ansible_vars.yaml"
}
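For reference, with the tfvars above, the templates shown earlier would render a broker.conf along these lines (assembled from the values in this thread):

```
s3ManagedLedgerOffloadBucket=omb-testing-1
s3ManagedLedgerOffloadRegion=us-west-2
s3ManagedLedgerOffloadServiceEndpoint=https://omb-testing-1.s3.us-west-2.amazonaws.com
```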

jcalcote commented 2 years ago

Here's a possible workaround for some people's use cases: https://stackoverflow.com/questions/73169813/jclouds-getbucketlocation-timeout-on-getblob/73902608#73902608

Note that this ticket also suggests a possible general solution whereby jclouds would provide a mechanism for users to pre-load the bucket-to-region LoadingCache with key/value pairs for buckets in known regions. This would seem simpler than reworking the API from top to bottom to pass in the user-specified endpoint.
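The pre-load idea from that ticket can be sketched in plain Java. jclouds' actual cache is Guava's LoadingCache; the class and method names below are stand-ins chosen to illustrate the mechanism, not real jclouds API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Stand-in for jclouds' bucket-to-region cache. The suggestion: let users
// seed known bucket -> region pairs up front so the permission-requiring
// GetBucketLocation call is never made for those buckets.
class BucketRegionCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // would wrap GetBucketLocation

    BucketRegionCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // The proposed user-facing hook: pre-load a known mapping.
    void preload(String bucket, String region) {
        cache.put(bucket, region);
    }

    // Falls back to the loader (i.e. the remote call) only on a cache miss.
    String regionOf(String bucket) {
        return cache.computeIfAbsent(bucket, loader);
    }
}

public class PreloadDemo {
    public static void main(String[] args) {
        BucketRegionCache regions = new BucketRegionCache(bucket -> {
            throw new IllegalStateException("GetBucketLocation reached for " + bucket);
        });
        regions.preload("omb-testing-1", "us-west-2");
        // Served from the pre-loaded entry; the loader is never invoked.
        System.out.println(regions.regionOf("omb-testing-1"));
    }
}
```

The appeal of this shape is that it leaves the existing jclouds call paths untouched: buckets the operator didn't pre-load still resolve exactly as today.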