StatCan / aaw

Documentation for the Advanced Analytics Workspace Platform
https://statcan.github.io/aaw/

Finalize issues of S3 proxy features to go live #1538

Closed chuckbelisle closed 1 year ago

chuckbelisle commented 1 year ago

Refer to Blob-CSI epic https://github.com/StatCan/daaas/issues/1001 and move forward with go-live of Blob-CSI and S3 proxy features.

Following confirmation that the feature functions in Prod, we will need to action the following

Collinbrown95 commented 1 year ago

Final Development Items

TODO

Collinbrown95 commented 1 year ago

Testing and Documentation

1. Remove s3proxy from a user namespace that no longer requires it

Case 1: Profile is created from https://github.com/StatCan/aaw-kubeflow-profiles

I created the collin-test profile for this test.

  1. In the jsonnet file for the profile, wrap the profile in the addS3 function (see example for my profile here: https://github.com/StatCan/aaw-kubeflow-profiles/commit/a5d4883e678f04c6e245ef9e413c3c5c2952bbcf)

  2. User logs into kubeflow (dev) with usual SSO

  3. kubeflow central dashboard configmap populates "Storage" option on the left menu of Kubeflow

  4. User clicks "Storage" and is presented with the aws-js-s3-explorer UI. User must select Unclassified, Unclassified read only, or Protected-b

Note: On the first page load, the service worker may not get loaded into the browser. This can lead to a false "403" forbidden error message. As described in the s3proxy developer docs, we have a service worker that removes an erroneous authorization header that is added automatically by the client-side s3 javascript library used by the aws-js-s3-explorer tool.
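The header-stripping step described in the note above can be sketched roughly as follows. This is a minimal illustration, not the actual s3proxy service worker: the function name `stripAuthorization` and the plain-object header representation are assumptions made for the example. A real service worker would apply the same logic to the request inside its `fetch` event handler.

```typescript
// Hypothetical header representation for the sketch (not the browser Headers API).
type HeaderMap = Record<string, string>;

// Hypothetical helper: returns a copy of the headers without any
// Authorization entry. HTTP header names are case-insensitive, so the
// comparison is done on the lower-cased name.
function stripAuthorization(headers: HeaderMap): HeaderMap {
  const cleaned: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() !== "authorization") {
      cleaned[name] = headers[name];
    }
  }
  return cleaned;
}
```

If the service worker has not loaded yet, no such stripping happens, which is consistent with the false 403 seen on the first page load.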

Case 1.1: Select "Unclassified" bucket

  1. User creates a test prefix (folder) in the S3 explorer; the folder is created successfully.

  2. User uploads a moderately large single file (122 MB); the file is uploaded successfully.

Note: The UI is a bit finicky with file uploads; the progress bar shows 0% progress and turns green when the file upload completes. The file is uploaded correctly, but the UI appears as though no progress has been made.

Note: The aws-js-s3-explorer UI does not currently allow users to download files through the Kubeflow interface. The issue is that the request URL in the aws-js-s3-explorer application does not have the /s3/<user-namespace> prefix. Downloads do work correctly in a noVNC notebook, where requests are made directly against the s3proxy-web service in the cluster. For example, in a noVNC notebook, users can visit s3proxy-web/s3/<user-namespace>/ (or s3proxy-protected-b-web/s3/<user-namespace>/ in a protected-b environment) and connect directly to the s3proxy pod within the cluster.
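To make the prefix concrete, the in-cluster addresses mentioned in the note can be assembled like this. It is a sketch under stated assumptions: the helper name `s3ProxyAddress` is hypothetical, and the result deliberately carries no URL scheme because the note does not specify one. Only the two service names and the /s3/<user-namespace>/ path come from the thread.

```typescript
// Service names as given in the note above.
const UNCLASSIFIED_SERVICE = "s3proxy-web";
const PROTECTED_B_SERVICE = "s3proxy-protected-b-web";

// Hypothetical helper: builds the address a user would visit from a
// noVNC notebook, including the /s3/<user-namespace>/ prefix that the
// explorer UI currently omits from its download requests.
function s3ProxyAddress(userNamespace: string, isProtectedB: boolean): string {
  const service = isProtectedB ? PROTECTED_B_SERVICE : UNCLASSIFIED_SERVICE;
  return service + "/s3/" + userNamespace + "/";
}
```

For the collin-test profile used in this test, the unclassified address would be s3proxy-web/s3/collin-test/.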

Jose-Matsuda commented 1 year ago

Mathis and Jose's work

In relation to Collin's comment here: ✔️ means good to go, ❌ means a problem (as does ❗, which marks items that were not in Collin's initial comment), and ⭕ means a task still to be completed.

✔️ Removing Unusable Items from the Menu

The PR needed only slight tweaking, as it initially complained that it could not find index.html. Fixing it up results in this. As a result, navigating to it in the Kubeflow interface shows only unclassified. What we still needed to verify was how to reach the protected-b version from noVNC, where we should see only unclassified-ro and protected-b. --> We accessed this during tech elab using the internal s3proxy-protected-b-web service and verified that those two volumes were visible.

✔️ Verify that the appropriate volumes are mounting correctly to unclassified and protected-b instances of s3proxy.

To test this, I started up a few notebooks. The PVCs aaw-protected-b and aaw-unclassified-ro were mounted only to protected-b pods, and aaw-unclassified was mounted only to non-protected-b pods.
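The mounting rule verified here could be written down as a small check. This is only a sketch: the function name `mountViolations` and the idea of passing in a pod's mounted claim names as an array are assumptions for illustration. Only the three PVC names and the rule itself come from the test described above.

```typescript
// PVC names taken from the test above.
const PROTECTED_B_ONLY_CLAIMS = ["aaw-protected-b", "aaw-unclassified-ro"];
const UNCLASSIFIED_ONLY_CLAIMS = ["aaw-unclassified"];

// Hypothetical check: returns the claims that should NOT be mounted on
// a pod of the given classification. An empty result means the pod
// follows the rule observed in the test.
function mountViolations(isProtectedB: boolean, mountedClaims: string[]): string[] {
  const forbidden = isProtectedB ? UNCLASSIFIED_ONLY_CLAIMS : PROTECTED_B_ONLY_CLAIMS;
  return mountedClaims.filter((claim) => forbidden.indexOf(claim) !== -1);
}
```

A protected-b pod mounting only aaw-protected-b and aaw-unclassified-ro yields no violations, while a non-protected-b pod that also mounts aaw-protected-b would be flagged.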

❌ Need to ensure that there is persistence across s3proxy restarts

However, note that if you delete the s3proxy pod in your namespace (or restart the deployment), any files that were uploaded are gone.

⭕ VS for storage button

❗ IMPORTANT Things to Note

We are able to access the Kubeflow central dashboard from the protected-b remote desktop and, as a result, can interact with the unclassified volume. You are also able to upload to the unclassified-ro volume.

Collinbrown95 commented 1 year ago

Notes from the debugging session with @Jose-Matsuda and @mathis-marcotte: https://github.com/StatCan/daaas-private/issues/80

The discussion includes some networking details, which is why it is kept in a private repo.

mathis-marcotte commented 1 year ago

Closing this issue. The work will progress through other issues, including