kubeflow / kubeflow

Machine Learning Toolkit for Kubernetes
https://www.kubeflow.org/
Apache License 2.0
14.35k stars 2.41k forks source link

Use Tensorboard with Object Store #6493

Closed N-Kingsley closed 2 weeks ago

N-Kingsley commented 2 years ago

How to write a right Object Storage Link? For example, how to set the access key and secret key in the link when using the Minio. image

kimwnasptd commented 2 years ago

This is currently not supported, but a great feature to implement.

We've solved this problem in Notebooks by using PodDefaults for attaching credentials, but we might need to have some more logic for TensorBoard.

@N-Kingsley have you managed to configure TensorBoard locally to access MinIO?

N-Kingsley commented 2 years ago

This is currently not supported, but a great feature to implement.

We've solved this problem in Notebooks by using PodDefaults for attaching credentials, but we might need to have some more logic for TensorBoard.

@N-Kingsley have you managed to configure TensorBoard locally to access MinIO?

No, how should I solve this problem?

hwk42 commented 2 years ago

I solved this problem by add credential env for tensorboard pod. Is it a plan to support in v1.6?

tks-ai commented 2 years ago

I think it would be a very useful feature to support s3. I am struggling with this problem for a few days ,sad to know this is not supported, If take a fully implement will take time, how about just pass the credential of kubeflow built-in minio to Tensorboard by ConfigMap. Think this would be a quick solution.

kimwnasptd commented 2 years ago

AFAIK TensorBoard should be able to work with credentials that are set by ENV Variables https://github.com/tensorflow/tensorboard/blob/master/tensorboard/summary/writer/event_file_writer_s3_test.py#L35-L36

Ideally we should extend our TensorBoard CR to make it possible to assign PodDefaults to the underlying Pod that will be created. This way we can have the same approach for injecting AWS credentials to both Notebook and TensorBoard servers.

This will require some changes to both the web app and the underlying CR though. Any takers?

kimwnasptd commented 2 years ago

@wyljpn just saw your comment in the other issue, let's use this one for this specific feature https://github.com/kubeflow/kubeflow/issues/3578#issuecomment-1032417871

First of all nice work! As mentioned above instead of expecting users to edit ENV vars directly let's instead use PodDefaults, so that we can reuse this part for Notebooks as well. We can also document a PodDefault that sets the ENV Vars accordingly.

The changes I have in mind are:

  1. Extend the spec of the TensorBoard CR to allow users to edit the labels that will end up in the Pod. We can do .spec.labels for this in the CRD
  2. Extend the frontend to have a Configurations section, just like we have for Notebooks
  3. Extend the backend to set the PodDefaults labels to the CR, based on the frontend's data

We can break this down to small PRs for each one as well

laserK3000 commented 2 years ago

As a side-note: I went down the route as described in #3578 and compiled a custom controller. It works, however logs in subfolders are not being listed if the tensorboard points to the parent directory (which is annoying as it requires one board per run). It is different if one uses a more recent version of the tensorflow/tensorflow container (I tried 2.9.1). Unfortunately these newer versions lack tensorflow-io and hence don't work with s3 out of the box. I got everything to work after compiling my own tensorboard container with tensorboard-io.

N-Kingsley commented 2 years ago

I solved this problem by add credential env for notebook pod. Is it a plan to support in v1.6?

Could you tell me how to solve this problem?

hwk42 commented 2 years ago

Just create a secret and poddefault before create tensorboard for quick workaround. The name tb-test in selector field is same with tensorboard.Generally,label should be add in tensorboard CR

apiVersion: "kubeflow.org/v1alpha1"
kind: PodDefault
metadata:
name: add-minio-secret
namespace: <your tb namespace>
spec:
selector:
matchLabels:
app: "tb-test"
desc: "add minio credential"
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: minio-secret
key: accesskey
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: minio-secret
key: secretkey
- name: S3_ENDPOINT
valueFrom:
secretKeyRef:
name: minio-secret
key: endpoint
- name: S3_USE_HTTPS
valueFrom:
secretKeyRef:
name: minio-secret
key: usehttps
- name: S3_VERIFY_SSL
valueFrom:
secretKeyRef:
name: minio-secret
key: verifyssl
---
kind: Secret
apiVersion: v1
metadata:
name: minio-secret
stringData:
accesskey: minio
secretkey: minio123
endpoint: minio-service.kubeflow:9000
usehttps: "0"
verifyssl: "1"
d-gol commented 1 year ago

As mentioned above instead of expecting users to edit ENV vars directly let's instead use PodDefaults, so that we can reuse this part for Notebooks as well. We can also document a PodDefault that sets the ENV Vars accordingly.

The changes I have in mind are:

  1. Extend the spec of the TensorBoard CR to allow users to edit the labels that will end up in the Pod. We can do .spec.labels for this in the CRD
  2. Extend the frontend to have a Configurations section, just like we have for Notebooks
  3. Extend the backend to set the PodDefaults labels to the CR, based on the frontend's data

We can break this down to small PRs for each one as well

@kimwnasptd this is a great idea! I can implement this, if no one is working on it?

AlexandreBrown commented 1 year ago

Is this issue fixed @kimwnasptd @surajkota ?

surajkota commented 1 year ago

@AlexandreBrown yes and it is available in 1.7-rc1, you can find the instructions on the description of this PR - https://github.com/kubeflow/kubeflow/pull/6924

andreyvelich commented 2 weeks ago

This has been fixed in #6924