### Prep tasks
- [x] Choose a workload (xgboost training, cuml training)
- [x] Put a dataset into Delta Lake
- [x] Figure out how to read with dask-deltatable into a dask-cudf dataframe
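
The read step in the last task could look roughly like the sketch below. It assumes `dask_deltatable.read_deltalake` for the lazy read and Dask's `DataFrame.to_backend("cudf")` for the move to GPU, which needs `dask-cudf` installed on the workers. The function name `read_delta_as_dask_cudf` and the table URI are hypothetical, not something from the docs.

```python
def read_delta_as_dask_cudf(table_uri):
    """Read a Delta table lazily and move its partitions onto the GPU.

    Hypothetical helper: the name and argument are illustrative only.
    """
    # Imported lazily so this module still imports on machines without
    # the Delta/GPU stack installed.
    import dask_deltatable as ddt

    # read_deltalake returns a pandas-backed Dask DataFrame.
    ddf = ddt.read_deltalake(table_uri)

    # Convert every partition to cudf (assumes dask-cudf is installed).
    return ddf.to_backend("cudf")
```

On a running cluster this might be called as `ddf = read_delta_as_dask_cudf("dbfs:/path/to/table")`, where the path is a placeholder.
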

### Workflow structure

Option 1 (goal):
- Start MNMG Dask/RAPIDS cluster on Databricks (what we have in the docs now)
- Read dataset from Delta Lake with dask-deltatable into a dask-cudf dataframe
- Do some preprocessing
- Train a model
- Summary
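
The preprocessing and training steps of Option 1 could be sketched as below, assuming XGBoost 2.0+ (for the `device` parameter) and its `xgboost.dask` interface. The function name, the `dropna` preprocessing placeholder, the label column, and the regression objective are all assumptions for illustration. The note does not fix a dataset or objective.

```python
def train_xgboost_on_gpu(client, ddf, label_col, num_boost_round=100):
    """Train an XGBoost model on a dask-cudf DataFrame (hypothetical helper)."""
    # Lazy import so the module imports without XGBoost installed.
    from xgboost import dask as dxgb

    # Placeholder preprocessing (assumption): drop rows with missing values.
    ddf = ddf.dropna()

    # Split features and label.
    X = ddf.drop(columns=[label_col])
    y = ddf[label_col]

    # Quantile DMatrix is the memory-friendly input for the "hist" method.
    dtrain = dxgb.DaskQuantileDMatrix(client, X, y)

    # Assumed settings: GPU histogram training with a regression objective.
    params = {
        "tree_method": "hist",
        "device": "cuda",
        "objective": "reg:squarederror",
    }
    result = dxgb.train(client, params, dtrain, num_boost_round=num_boost_round)

    # train() returns a dict holding the booster and the evaluation history.
    return result["booster"], result["history"]
```

Here `client` is the `dask.distributed.Client` connected to the Databricks cluster, and `ddf` is the dask-cudf dataframe from the read step.
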

Option 2 (stretch goal):
- Start MNMG Dask/RAPIDS cluster on Databricks (what we have in the docs now)
- Generate data somehow in a GPU-intensive way (cuStreamz)
Related links: