livebook-dev / livebook

Automate code & data workflows with interactive Elixir notebooks
https://livebook.dev
Apache License 2.0

Unable to Set S3 Bucket URL for Team Workspace #2826

Open zhihuizhang17 opened 1 week ago

zhihuizhang17 commented 1 week ago

Background

We use AWS IAM to grant Kubernetes pods access to an S3 bucket for our services by adding the iam.amazonaws.com/role annotation to the Deployment YAML. The bucket is accessible only to the Livebook team's service account, which means access is restricted to pods within the Kubernetes cluster. As a result, I am unable to access the S3 bucket from a local Livebook instance, and I can only set the bucket URL for my personal workspace using the team's Livebook instance.
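
For context, the relevant part of our Deployment looks roughly like this (a trimmed sketch of how I understand the setup; the role ARN, names, and image tag are placeholders):

    # Trimmed Deployment sketch. The iam.amazonaws.com/role annotation sits on the
    # pod template metadata so the pods can assume the role that grants S3 access.
    # The role ARN and image tag are placeholders.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: livebook
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: livebook
      template:
        metadata:
          labels:
            app: livebook
          annotations:
            iam.amazonaws.com/role: arn:aws:iam::123456789012:role/livebook-s3-access
        spec:
          containers:
            - name: livebook
              image: ghcr.io/livebook-dev/livebook:0.14.4
              ports:
                - containerPort: 8080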

However, when I try to set the bucket_url for my team's workspace from that pod, I get the following error message:

You are not authorized to perform this action, make sure you have the access and you are not in a Livebook App Server/Offline instance

Proposed Solution

To resolve this issue, I propose adding a new environment variable that lets us set the bucket_url when LIVEBOOK_AWS_CREDENTIALS is set to true. This would help configure access appropriately when running in environments with restricted access, such as within a Kubernetes pod.
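
To make the idea concrete, the Deployment env could look something like the sketch below. LIVEBOOK_TEAMS_BUCKET_URL is a purely hypothetical name for the proposed variable and does not exist today; the bucket URL is a placeholder:

    # Sketch only: LIVEBOOK_AWS_CREDENTIALS exists today, while the second
    # variable is the hypothetical one proposed above.
    env:
      - name: LIVEBOOK_AWS_CREDENTIALS
        value: "true"
      - name: LIVEBOOK_TEAMS_BUCKET_URL   # proposed, not implemented
        value: "https://example-bucket.s3.us-east-1.amazonaws.com"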

By the way, there is a typo in the screenshot above: https://github.com/livebook-dev/livebook/pull/2824

jonatanklosko commented 1 week ago

Hey @zhihuizhang17! Just to make sure we are on the same page: when developing locally you use a Teams workspace and have an S3 file storage that is accessible to you locally. Then, on the pod instance running the Livebook app server, you want to use a different URL for that file storage, one that is only accessible from the pod. I assume your use case is reading files from the storage in a notebook via Kino.FS, so the expectation is that both S3 buckets have files at the same paths. Is this correct?

zhihuizhang17 commented 1 week ago

Hey @jonatanklosko! Thanks for your reply. I think there might be a misunderstanding. Let me clarify our use case:

  1. I'm not trying to use different S3 buckets for local and pod environments. There is only one S3 bucket that I want to access.

  2. The current situation:

    • This S3 bucket is only accessible from within the Kubernetes cluster via IAM role (through the iam.amazonaws.com/role annotation in our Deployment)
    • I cannot access this bucket locally because I don't have direct AWS credentials
    • I can only access this bucket through pods in the Kubernetes cluster
  3. The specific issue I'm facing:

    • I can successfully set the bucket_url for my personal workspace using the Livebook instance running in the pod
    • However, when I try to set the bucket_url for my team's workspace from the same pod, I get the error:
      You are not authorized to perform this action, make sure you have the access and you are not in a Livebook App Server/Offline instance
    • This happens despite the pod having the correct IAM role and being able to access the S3 bucket
  4. My proposed solution:

    • Add a new environment variable that would allow setting the bucket_url when LIVEBOOK_AWS_CREDENTIALS is true
    • This would help in scenarios where the Livebook instance has indirect access to S3 (like through IAM roles) rather than direct AWS credentials

josevalim commented 1 week ago

@zhihuizhang17 you are getting that error because the Livebook instance you are using is an "agent" instance. It is only meant to run deployed notebooks and is therefore read-only (in fact, we should not be allowing you to set the file system for the personal workspace).

You should deploy a regular Livebook instance, listed here, and then they should all work. We should probably make this clearer.

zhihuizhang17 commented 1 week ago

You should deploy a regular Livebook instance.

@josevalim Sorry, I'm a bit confused now. I followed the guide you mentioned to deploy my team's Livebook service. The only changes I made were:

  1. Changed type: LoadBalancer to type: ClusterIP, since there is no load balancer in my Kubernetes mesh (see the Service sketch after this list).
  2. Added a kind: VirtualServer resource to assign an intranet domain.
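
For reference, the Service change in item 1 amounts to something like this (a minimal sketch; names and labels are placeholders, and 8080 assumes Livebook's default port):

    # Minimal Service sketch: ClusterIP instead of LoadBalancer.
    apiVersion: v1
    kind: Service
    metadata:
      name: livebook
    spec:
      type: ClusterIP        # was: type: LoadBalancer
      selector:
        app: livebook
      ports:
        - port: 8080
          targetPort: 8080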

I believe I have already set it up correctly. How can I determine whether it's an agent instance or a regular Livebook instance?

josevalim commented 1 week ago

@zhihuizhang17 when you deployed it, did you have to manually add your Livebook Teams workspace to that instance, or was it already there? Have you set the LIVEBOOK_TEAMS_AUTH or LIVEBOOK_TEAMS_KEY env vars?

zhihuizhang17 commented 1 week ago

when you deployed it, did you have to manually add your Livebook Teams workspace to that instance, or was it already there?

Yes, it was already there.

Have you set the LIVEBOOK_TEAMS_AUTH or LIVEBOOK_TEAMS_KEY env vars?

Yes. I set them in deployment.yaml as follows:

            - name: LIVEBOOK_COOKIE
              valueFrom:
                secretKeyRef:
                  name: livebook-secret
                  key: LIVEBOOK_COOKIE
            - name: LIVEBOOK_SECRET_KEY_BASE
              valueFrom:
                secretKeyRef:
                  name: livebook-secret
                  key: LIVEBOOK_SECRET_KEY_BASE
            - name: LIVEBOOK_TEAMS_AUTH
              valueFrom:
                secretKeyRef:
                  name: livebook-secret
                  key: LIVEBOOK_TEAMS_AUTH
            - name: LIVEBOOK_TEAMS_KEY
              valueFrom:
                secretKeyRef:
                  name: livebook-secret
                  key: LIVEBOOK_TEAMS_KEY

josevalim commented 1 week ago

Yes, that's precisely the issue. Once you set those two env vars, it becomes an "agent". You should remove them and then you will be able to add your org as usual, as you did on your local machine, with full access. I will improve the docs.
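
In terms of the deployment.yaml snippet above, that roughly means keeping only the first two entries (a sketch based on your snippet):

    # Keep the cookie and secret key base; drop the Teams variables so the
    # instance is not started as a read-only "agent" app server.
    env:
      - name: LIVEBOOK_COOKIE
        valueFrom:
          secretKeyRef:
            name: livebook-secret
            key: LIVEBOOK_COOKIE
      - name: LIVEBOOK_SECRET_KEY_BASE
        valueFrom:
          secretKeyRef:
            name: livebook-secret
            key: LIVEBOOK_SECRET_KEY_BASE
      # LIVEBOOK_TEAMS_AUTH and LIVEBOOK_TEAMS_KEY removed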

zhihuizhang17 commented 1 week ago

@josevalim It works!!! However, there are still some unusual behaviors:

  1. After I removed LIVEBOOK_TEAMS_KEY and LIVEBOOK_TEAMS_AUTH, the warning 'You are running a Livebook app server. This workspace is in read-only mode.' disappeared, and I was able to set the bucket_url. However, when I refreshed the UI after setting it, the warning returned, so I cannot detach the S3 bucket URL from the cluster instance and have to detach it from my local Livebook instance instead. In other words, I can only set the S3 bucket via the cluster instance but can only remove it via the local instance.

  2. After executing kubectl rollout restart deployment/livebook to redeploy the pods, the web UI becomes unstable when refreshing the page. The team workspace sometimes disappears and then reappears after the next refresh. Local apps I deployed disappear too. My version is v0.14.4.

zhihuizhang17 commented 1 week ago

Update: As of now, the best practice for me seems to be:

  1. Deploy instances without setting LIVEBOOK_TEAMS_KEY and LIVEBOOK_TEAMS_AUTH.

  2. Set the S3 bucket URL for the team workspace from a Kubernetes instance, which allows the Livebook Teams console to store the configuration and sync it across all instances.

  3. Redeploy instances with LIVEBOOK_TEAMS_KEY and LIVEBOOK_TEAMS_AUTH set.

This ensures that the team workspace is always present when I refresh the UI after the next deployment.

josevalim commented 1 week ago

After executing kubectl rollout restart deployment/livebook to redeploy the pods, the web UI becomes unstable when refreshing the page. The team workspace sometimes disappears and then reappears after the next refresh. Local apps I deployed disappear too. My version is v0.14.4.

This is happening because you have multiple instances and the workspace connection is per instance. We currently don't sync them across the cluster. If you run a single instance (as you would run a single instance on your machine), then you should be fine.
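
In Deployment terms that boils down to keeping a single replica (only the relevant field shown):

    # Keep the app server to one pod: the workspace connection is per instance
    # and is not synced across the cluster.
    spec:
      replicas: 1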

Thank you for the feedback, we will discuss internally how to improve the workflow. Meanwhile, I will update the docs to mention it should be a single instance.