danielfrg / s3contents

Jupyter Notebooks in S3 - Jupyter Contents Manager implementation
Apache License 2.0
248 stars 88 forks

Is there a way to have it mount and display 2 S3 buckets? #124

Open Snowned0425 opened 3 years ago

Snowned0425 commented 3 years ago

Our use case is that we want each user to have a place in S3 where they can store their own code, but also have access to a shared bucket with notebooks that everyone sees. I have been able to make it do one or the other, but not both at the same time.

I use this so that each user's JupyterLab loads a unique folder on S3 specific to their username:

          import os
          user = os.environ['JUPYTERHUB_USER']
          config.S3ContentsManager.prefix = os.path.join("jupyter", user)

I use this when I want each user to mount the same shared folder in the bucket:

          config.S3ContentsManager.prefix = "jupyter/shared"

I can share the entire config if needed, but these are the relevant parts. I'm looking for a way to mount both, so that a user sees both mounts when they log in. Is this possible?

danielfrg commented 3 years ago

You can probably use the HybridContentsManager to accomplish this: https://github.com/danielfrg/s3contents#access-local-files

Let us know if you are able to try it.

Snowned0425 commented 3 years ago

I did read through their docs and I didn't see any way to use two different S3 buckets. I only saw how to use both S3 and local storage.

pvanliefland commented 3 years ago

Hey @Snowned0425, we do something like this and it works just fine:

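from s3contents import S3ContentsManager
from hybridcontents import HybridContentsManager

c = get_config()

c.NotebookApp.contents_manager_class = HybridContentsManager

# Each key is the top-level directory under which that manager is mounted.
c.HybridContentsManager.manager_classes = {
    "bucket1": S3ContentsManager,
    "bucket2": S3ContentsManager
}

# Each item will be passed to the constructor of the appropriate content manager.
c.HybridContentsManager.manager_kwargs = {
    # Args for the S3ContentsManager mounted at bucket1
    "bucket1": {
        "access_key_id": ...,
        "secret_access_key": ...,
        "bucket": ...,
    },
    # Args for the S3ContentsManager mounted at bucket2
    "bucket2": {
        "access_key_id": ...,
        "secret_access_key": ...,
        "bucket": ...,
    },
}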
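
For the use case in the original question (a per-user folder plus a shared folder), the same pattern could probably be applied with two prefixes instead of two buckets. The following is a minimal sketch, not a tested configuration: the helper `build_hybrid_config`, the mount names `personal` and `shared`, and the bucket name `my-bucket` are hypothetical, and credentials are left out. It only builds the two dictionaries that `HybridContentsManager` expects, using the `bucket` and `prefix` settings already shown in this thread.

```python
import posixpath


def build_hybrid_config(user, bucket, manager_class, root="jupyter"):
    """Return (manager_classes, manager_kwargs) for a personal and a shared mount.

    `manager_class` would be S3ContentsManager in a real config; it is a
    parameter here so the helper can be exercised without s3contents installed.
    """
    # posixpath keeps "/" separators in S3 prefixes regardless of the host OS.
    mounts = {
        "personal": posixpath.join(root, user),      # hypothetical mount name
        "shared": posixpath.join(root, "shared"),    # hypothetical mount name
    }
    manager_classes = {name: manager_class for name in mounts}
    manager_kwargs = {
        name: {"bucket": bucket, "prefix": prefix}
        for name, prefix in mounts.items()
    }
    return manager_classes, manager_kwargs


# In a Jupyter config file this might be wired up roughly as follows
# (assumption: untested, credentials omitted):
#
#   import os
#   from s3contents import S3ContentsManager
#   from hybridcontents import HybridContentsManager
#
#   c = get_config()
#   classes, kwargs = build_hybrid_config(
#       os.environ["JUPYTERHUB_USER"], "my-bucket", S3ContentsManager
#   )
#   c.NotebookApp.contents_manager_class = HybridContentsManager
#   c.HybridContentsManager.manager_classes = classes
#   c.HybridContentsManager.manager_kwargs = kwargs
```

With this layout each user would see two top-level directories, one backed by `jupyter/<username>` and one backed by `jupyter/shared`, assuming both prefixes live in the same bucket.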

Snowned0425 commented 3 years ago

Thanks. We ended up abandoning the S3 module because it's incompatible with the git module, which requires the files to be local rather than in S3. We are now looking at using s3fs to mount a bucket as our /home directory, so that Jupyter treats the files as local and git works.