cetic / helm-fadi

Helm Chart for FADI

https://github.com/cetic/fadi

Apache License 2.0

Develop #23

Closed AyadiAmen closed 4 years ago

AyadiAmen commented 4 years ago

What this PR does / why we need it:

This PR updates the jupyterhub helm chart and adds a tensorflow notebook.

Special notes for your reviewer:

TensorBoard (a tool that provides the measurements and visualizations needed during the machine learning workflow) might be added later using a separate helm chart.

Checklist

[Place an '[x]' (no spaces) in all applicable fields. Please remove unrelated fields.]

[x] DCO signed
[x] Chart Version bumped
[ ] Variables are documented in the README.md

AyadiAmen commented 4 years ago

@alexnuttinck I think it is. TensorFlow is a library for dataflow and differentiable programming used to create models, and that image is sufficient to do so. I tested it with https://www.tensorflow.org/tutorials/quickstart/beginner and https://www.tensorflow.org/tutorials/load_data/csv.

@titsitits what do you think?

alexnuttinck commented 4 years ago

The notebook image seems to be enough. I'm merging this PR.

titsitits commented 4 years ago

Hi @alexnuttinck,

I tested and it seems to work well in minikube. So in that respect I would approve the pull request.

Note, however, that TensorFlow without GPU support is of limited interest, and I'm not sure that the TensorFlow version installed in this Docker image supports GPUs. To be sure, it should be tested on a machine or cluster with an NVIDIA GPU and CUDA enabled. (This remark applies beyond TensorFlow: other popular machine learning libraries are also far more efficient on GPUs, especially decision-tree-based libraries such as XGBoost or CatBoost.)

titsitits commented 4 years ago

Hi @alexnuttinck @AyadiAmen @banzo,

After some investigation, I noticed that since TensorFlow 2.0 the same Python library (installed with pip) is used for both CPU and GPU: https://www.tensorflow.org/install/gpu So the Docker image you use should be OK: https://hub.docker.com/r/jupyter/tensorflow-notebook/dockerfile

However, GPU support must still be enabled by installing the NVIDIA driver and CUDA. More information on that is available here as well: https://www.tensorflow.org/install/gpu Here are examples of Docker stacks that include GPU support: https://github.com/jolibrain/docker-stacks A test in a Google Cloud VM or cluster could be interesting!
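For such a test, a quick check run inside the notebook could look like the sketch below. It is only an illustration: the helper name `gpu_report` is hypothetical, and it assumes a TensorFlow release that provides `tf.config.list_physical_devices` (2.1 or later). It reports whether the installed build was compiled against CUDA and which GPUs, if any, TensorFlow can actually see.

```python
import importlib


def gpu_report():
    """Describe TensorFlow GPU visibility in the current environment.

    Hypothetical helper for illustration; not part of the chart or the image.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {
            "tensorflow_installed": False,
            "version": None,
            "built_with_cuda": None,
            "gpus": [],
        }

    # GPUs are only listed when the NVIDIA driver and CUDA libraries are usable.
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty `gpus` list together with `built_with_cuda` set to `True` would suggest that the library itself is fine and that the driver or CUDA setup on the node is what is missing.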

Also note that Kubeflow could ultimately be a solution if it can be coupled or integrated with FADI (it already integrates various data science tools, including Jupyter, ML workflow management and deployment): https://www.kubeflow.org/