rapidsai / deployment

RAPIDS Deployment Documentation
https://docs.rapids.ai/deployment/stable/

Release 24.08 checklist #402

Closed jacobtomlinson closed 1 month ago

jacobtomlinson commented 1 month ago

Release checklist

For the upcoming release we need to verify our documentation. This is a best-effort activity, so please refer to the checklist from the previous release and focus on pages that were not verified last time.

Verify pages

### Index/Non-technical
- [x] https://docs.rapids.ai/deployment/nightly/
- [x] https://docs.rapids.ai/deployment/nightly/cloud
- [x] https://docs.rapids.ai/deployment/nightly/cloud/aws
- [x] https://docs.rapids.ai/deployment/nightly/cloud/azure
- [x] https://docs.rapids.ai/deployment/nightly/cloud/gcp
- [x] https://docs.rapids.ai/deployment/nightly/cloud/ibm
- [x] https://docs.rapids.ai/deployment/nightly/cloud/nvidia
- [x] https://docs.rapids.ai/deployment/nightly/examples
- [x] https://docs.rapids.ai/deployment/nightly/guides
- [x] https://docs.rapids.ai/deployment/nightly/hpc
- [x] https://docs.rapids.ai/deployment/nightly/local
- [x] https://docs.rapids.ai/deployment/nightly/platforms
- [x] https://docs.rapids.ai/deployment/nightly/tools
### P0
- [x] https://docs.rapids.ai/deployment/nightly/cloud/aws/sagemaker
- [x] https://docs.rapids.ai/deployment/nightly/cloud/azure/azureml
- [x] https://docs.rapids.ai/deployment/nightly/cloud/gcp/vertex-ai
- [x] https://docs.rapids.ai/deployment/nightly/cloud/nvidia/bcp
- [x] https://docs.rapids.ai/deployment/nightly/platforms/colab
- [x] https://docs.rapids.ai/deployment/nightly/platforms/databricks (single-node)
- [x] https://docs.rapids.ai/deployment/nightly/platforms/databricks (multi-node)
### P1
- [x] https://docs.rapids.ai/deployment/nightly/cloud/aws/ec2
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/aws/eks
- [x] https://docs.rapids.ai/deployment/nightly/cloud/azure/aks
- [x] https://docs.rapids.ai/deployment/nightly/cloud/azure/azure-vm
- [x] https://docs.rapids.ai/deployment/nightly/cloud/gcp/compute-engine
- [x] https://docs.rapids.ai/deployment/nightly/cloud/gcp/gke
### P2
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/aws/ec2-multi
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/aws/ecs
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/azure/azure-vm-multi
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/gcp/dataproc
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/ibm/virtual-server
- [ ] https://docs.rapids.ai/deployment/nightly/examples/rapids-1brc-single-node/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/rapids-azureml-hpo/notebook
- [x] https://docs.rapids.ai/deployment/nightly/examples/rapids-ec2-mnmg/notebook
- [x] https://docs.rapids.ai/deployment/nightly/examples/rapids-optuna-hpo/notebook (https://github.com/rapidsai/deployment/pull/404)
- [ ] https://docs.rapids.ai/deployment/nightly/examples/rapids-sagemaker-higgs/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/rapids-sagemaker-hpo/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/time-series-forecasting-with-hpo/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-azure-mnmg-daskcloudprovider/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-dask-databricks/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-gpu-hpo-job-parallel-k8s/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-gpu-hpo-job-parallel-ngc/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-gpu-hpo-mnmg-parallel-k8s/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-randomforest-gpu-hpo-dask/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/examples/xgboost-rf-gpu-cpu-benchmark/notebook
- [ ] https://docs.rapids.ai/deployment/nightly/guides/azure/infiniband
- [x] https://docs.rapids.ai/deployment/nightly/guides/colocate-workers
- [ ] https://docs.rapids.ai/deployment/nightly/guides/l4-gcp
- [x] https://docs.rapids.ai/deployment/nightly/guides/mig
- [x] https://docs.rapids.ai/deployment/nightly/guides/scheduler-gpu-optimization
- [x] https://docs.rapids.ai/deployment/nightly/guides/scheduler-gpu-requirements
- [ ] https://docs.rapids.ai/deployment/nightly/platforms/coiled
- [ ] https://docs.rapids.ai/deployment/nightly/platforms/kserve
- [x] https://docs.rapids.ai/deployment/nightly/platforms/kubeflow
- [ ] https://docs.rapids.ai/deployment/nightly/platforms/kubernetes
- [x] https://docs.rapids.ai/deployment/nightly/tools/dask-cuda
- [ ] https://docs.rapids.ai/deployment/nightly/tools/kubernetes/dask-helm-chart
- [ ] https://docs.rapids.ai/deployment/nightly/tools/kubernetes/dask-operator
- [x] https://docs.rapids.ai/deployment/nightly/tools/rapids-docker

_Issue text generated by scripts/gen_release_checklistissue.py._

jameslamb commented 1 month ago

https://docs.rapids.ai/deployment/nightly/cloud/aws/sagemaker

This is ok. All the links work, there are no typos, the AWS UI hasn't changed in ways that required re-generating any screenshots.

I've put up #408 with some proposed simplifications.

https://docs.rapids.ai/deployment/nightly/examples/rapids-sagemaker-higgs/notebook

This is linked to from the SageMaker page and is a great end-to-end demo of how to use RAPIDS with SageMaker notebooks and SageMaker Estimators.

I wasn't able to run this end-to-end today, but did test it thoroughly in the last release.

jameslamb commented 1 month ago

https://docs.rapids.ai/deployment/nightly/cloud/aws/eks

I was not able to get this example working during this release cycle's testing. I ran into some issues with the gpu-operator.

I've opened https://github.com/rapidsai/deployment/issues/409 to track that. I'll have MUCH more documentation to add there shortly, but wanted to at least get it open for people to subscribe to and link against.

jacobtomlinson commented 1 month ago

All P0/P1 pages have been tested (with the exception of EKS, which needs a longer-term follow-up in #409). Closing this out for 24.08.
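
For anyone repeating this check on a future release, here is a minimal sketch of how the unchecked pages could be tallied per priority section from the issue body. It assumes the checklist keeps the `### <section>` heading and `- [ ]` / `- [x]` task-list format used above. The function name `unverified_pages` and the regular expressions are illustrative only and are not taken from the script that generates the issue text.

```python
import re
from collections import defaultdict

# Short excerpt in the same format as the checklist in this issue.
CHECKLIST = """\
### P0
- [x] https://docs.rapids.ai/deployment/nightly/cloud/aws/sagemaker
### P1
- [x] https://docs.rapids.ai/deployment/nightly/cloud/aws/ec2
- [ ] https://docs.rapids.ai/deployment/nightly/cloud/aws/eks
"""

HEADING = re.compile(r"^###\s+(.+)$")          # e.g. "### P1"
ITEM = re.compile(r"^- \[( |x)\]\s+(\S+)")     # e.g. "- [ ] https://..."


def unverified_pages(text):
    """Map each section heading to the URLs that are still unchecked."""
    pending = defaultdict(list)
    section = None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            section = heading.group(1).strip()
            pending[section]  # register the section even if fully verified
            continue
        item = ITEM.match(line)
        if item and section is not None and item.group(1) == " ":
            pending[section].append(item.group(2))
    return dict(pending)


if __name__ == "__main__":
    for section, urls in unverified_pages(CHECKLIST).items():
        print(section, "unverified:", urls)
```

Run against the full issue body, a section that maps to an empty list has been fully verified, which makes it easy to confirm the P0/P1 status before closing.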