Problems
Users stumble on export AWS_REGION= and have errors in https://eksworkshop.com/030_eksctl/test/ and https://eksworkshop.com/020_prerequisites/workspaceiam/
People may work ahead on their own and miss what the instructor is saying.
People still go through the SSH key setup and then cannot find the next section: https://eksworkshop.com/020_prerequisites/sshkey/
Choose the right image for Fairing and the Jupyter notebook. People still use the default one for Fairing. I noticed someone used a GPU container and failed to import tensorflow because the container had no GPU.
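One way to guard against the GPU image mistake above is to pick the image from what the environment actually offers. This is only a sketch: the image names are hypothetical placeholders, and using the presence of the nvidia-smi binary as the GPU check is an assumption, not something the workshop does today.

```python
import shutil

# Hypothetical image names, used only for illustration.
CPU_IMAGE = "example.com/workshop/tensorflow-cpu:latest"
GPU_IMAGE = "example.com/workshop/tensorflow-gpu:latest"


def gpu_available() -> bool:
    """Heuristic (assumption): a node with nvidia-smi on PATH has a GPU."""
    return shutil.which("nvidia-smi") is not None


def pick_image(has_gpu: bool) -> str:
    """Return the GPU image only when a GPU is actually available."""
    return GPU_IMAGE if has_gpu else CPU_IMAGE


if __name__ == "__main__":
    # Print the image this environment should use.
    print(pick_image(gpu_available()))
```

Defaulting to the CPU image keeps the tensorflow import working for attendees who never touch the image setting.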
Fairing: the kubeflow-pipeline-data bucket name is inconspicuous. Users actually have to change it to their own bucket. We want to change the code so that it uses environment variables by default.
Cannot submit SageMaker jobs. Very similar to https://github.com/aws-samples/eks-workshop/issues/603. The reason is that users may miss the step to use the sagemaker user. I would suggest using an IAM user for all workshops: create a workshop user and attach SageMaker, S3, and ECR permissions.
Batch Transformation Failure: https://github.com/aws-samples/eks-workshop/issues/521
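For the Fairing bucket item above, reading the bucket name from an environment variable could look roughly like this. The variable name S3_BUCKET and the helper resolve_bucket are assumptions made for illustration; the workshop code may end up using different names.

```python
import os


def resolve_bucket(env=None, var="S3_BUCKET"):
    """Return the bucket named in the environment variable `var`.

    `var` defaults to S3_BUCKET, which is a hypothetical name.
    Raises RuntimeError when the variable is unset or blank, so a
    forgotten bucket shows up early with a clear message.
    """
    env = os.environ if env is None else env
    value = env.get(var, "").strip()
    if not value:
        raise RuntimeError(
            f"{var} is not set; export it with the name of your own bucket"
        )
    return value
```

Failing loudly here, rather than falling back to kubeflow-pipeline-data, is a design choice: a shared default bucket is exactly what users were tripping over.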
Increasing the cluster size is probably unnecessary: https://eksworkshop.com/advanced/420_kubeflow/install/ 3 nodes are probably good enough, and the resize takes an extra 3 minutes.
Move training and serving into the notebook. Users said model training and model inference are not part of Kubeflow. We should probably consider moving to Jupyter notebook based training and inference, as a simple example for practicing with notebooks.
Shorten the time to set up the cluster. If people don't care about the cluster setup, we could write a script that sets up users and brings up a cluster.
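The user part of such a script might look like the sketch below, which creates one workshop user and attaches SageMaker, S3, and ECR permissions as suggested earlier. The user name workshop-user and the helper create_workshop_user are hypothetical, and the choice of the broad AWS managed full-access policies is an assumption; a real workshop may want narrower policies. The function takes an IAM client (for example boto3.client("iam")) as an argument so it can be exercised without AWS credentials.

```python
# AWS managed policies for the three services (an assumption: full access
# is used here for simplicity, narrower policies may be preferable).
POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess",
]


def create_workshop_user(iam, user_name="workshop-user"):
    """Create the IAM user and attach each policy in POLICY_ARNS.

    `iam` is expected to behave like a boto3 IAM client, i.e. to provide
    create_user(UserName=...) and attach_user_policy(UserName=..., PolicyArn=...).
    Returns the list of attached policy ARNs.
    """
    iam.create_user(UserName=user_name)
    for arn in POLICY_ARNS:
        iam.attach_user_policy(UserName=user_name, PolicyArn=arn)
    return list(POLICY_ARNS)


# Usage with real AWS credentials (not run here):
#   import boto3
#   create_workshop_user(boto3.client("iam"))
```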
Users copy and paste the output shown in the instructions. I would suggest not using the exact same markdown for outputs as for commands. See https://eksworkshop.com/advanced/420_kubeflow/pipelines/ for an example: this is not something we want users to copy and run.