Closed: daddydrac closed this 3 years ago
@ericdill could you please triage this question? Thank you in advance for your time.
What problem are you currently having? What config have you set in your jupyter_notebook_config.py?
@ericdill: I’ll be honest, I can’t talk about it in the open due to security constraints and federal laws. I just need to know where and how you’re generating the string that connects to the bucket itself. I wish I could share any part of the code but I would be in a lot of trouble.
@ericdill I am trying to configure a special type of s3 for gov't. I just need an explanation on how you are generating the url/connection bucket strings.
Have you looked through this codebase to see how and where endpoint_url is being used? Ultimately this code is a wrapper around dask/s3fs, which itself uses aiobotocore to handle all of the interaction with the S3 APIs, so all we're doing here is passing args down to those libraries. If you need to understand exactly how those connection strings are being formed, you'll probably need to explore the aiobotocore library to figure that out. I'm not particularly familiar with how those strings are formatted.
@ericdill: Going back to the code:
endpoint_url = Unicode("https://s3.amazonaws.com", help="S3 endpoint URL")
Is https://s3.amazonaws.com a hard-coded string that is used statically throughout?
endpoint_url is a variable that is passed to the s3fs library, which then passes it to aiobotocore to actually initiate a connection to whatever S3 provider you're using. If you want to use a different S3 endpoint, provide a different value for endpoint_url in your jupyter_notebook_config.py. If you look at the readme you'll see a bunch of things being set:
from s3contents import S3ContentsManager
c = get_config()
# Tell Jupyter to use S3ContentsManager for all storage.
c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.access_key_id = "{{ AWS Access Key ID / IAM Access Key ID }}"
c.S3ContentsManager.secret_access_key = "{{ AWS Secret Access Key / IAM Secret Access Key }}"
c.S3ContentsManager.session_token = "{{ AWS Session Token / IAM Session Token }}"
c.S3ContentsManager.bucket = "{{ S3 bucket name }}"
# Optional settings:
c.S3ContentsManager.prefix = "this/is/a/prefix/on/the/s3/bucket"
c.S3ContentsManager.sse = "AES256"
c.S3ContentsManager.signature_version = "s3v4"
c.S3ContentsManager.init_s3_hook = init_function # See AWS key refresh
If you add
c.S3ContentsManager.endpoint_url = whatever_url_you_want
then things will probably work. I assume you've already tried this and it didn't work?
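To make the "passing args down" chain concrete, here is a minimal sketch of how an endpoint_url could be forwarded to s3fs. It is not s3contents' actual code: the helper names build_s3fs_kwargs and make_filesystem and the endpoint https://s3.example.internal are hypothetical. The only library call assumed is s3fs.S3FileSystem with its key, secret, token and client_kwargs arguments, where client_kwargs is handed on to the underlying aiobotocore client.

```python
def build_s3fs_kwargs(access_key_id, secret_access_key, endpoint_url, session_token=None):
    """Collect the arguments an s3fs.S3FileSystem could be created with.

    Hypothetical helper for illustration; not part of s3contents.
    """
    kwargs = {
        "key": access_key_id,
        "secret": secret_access_key,
        # client_kwargs is forwarded by s3fs to the aiobotocore client,
        # which is where endpoint_url is actually consumed.
        "client_kwargs": {"endpoint_url": endpoint_url},
    }
    if session_token:
        kwargs["token"] = session_token
    return kwargs


def make_filesystem(**kwargs):
    """Create the filesystem; requires s3fs to be installed."""
    import s3fs  # imported lazily so build_s3fs_kwargs works without s3fs

    return s3fs.S3FileSystem(**kwargs)


# Hypothetical placeholder values, not real credentials or a real endpoint.
example_kwargs = build_s3fs_kwargs("ACCESS_KEY", "SECRET_KEY", "https://s3.example.internal")
```

With s3fs installed, make_filesystem(**example_kwargs) would return a filesystem object that talks to the given endpoint instead of the default one.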
Yes @ericdill, I did try that. As for endpoint_url: is this an access endpoint to perform bucket operations, or is it the URL to the bucket itself?
I am also of the opinion that, at this point, this does not work on AWS GovCloud.
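On the endpoint-versus-bucket question: in S3-style APIs the endpoint is generally the address of the service, and the bucket is supplied separately (here via c.S3ContentsManager.bucket); the client combines the two when it builds a request. The sketch below shows the two common addressing styles in simplified form. The function object_url and the bucket and key names are hypothetical, and the real logic in aiobotocore/botocore also handles regions, signing and other details that are left out here.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def object_url(endpoint_url, bucket, key, path_style=True):
    """Combine a service endpoint, a bucket and a key into a request URL.

    Simplified illustration only; not the code used by botocore.
    """
    parts = urlsplit(endpoint_url)
    if path_style:
        # Path-style: the bucket is the first path segment.
        netloc = parts.netloc
        path = f"/{bucket}/{quote(key)}"
    else:
        # Virtual-hosted-style: the bucket becomes a subdomain of the endpoint host.
        netloc = f"{bucket}.{parts.netloc}"
        path = f"/{quote(key)}"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


path_url = object_url("https://s3.amazonaws.com", "my-bucket", "notebooks/a.ipynb")
vhost_url = object_url("https://s3.amazonaws.com", "my-bucket", "notebooks/a.ipynb", path_style=False)
```

In both styles the endpoint_url itself never contains the bucket name; swapping the endpoint changes which service is contacted, not which bucket is used.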
Got it working and here is the answer: https://github.com/dask/helm-chart/pull/78#issuecomment-737403661
I'll close this as well now. Thank you for your help, @ericdill!
In the file https://github.com/danielfrg/s3contents/blob/master/s3contents/s3manager.py, on line 20 you have:
endpoint_url = Unicode("https://s3.amazonaws.com", help="S3 endpoint URL")
Is this the part of the URL you use to connect to the s3 bucket in order to read/write to it?
If not, where does that code exist? Please advise.