Open JohnTigue opened 4 years ago
10 Minutes to cuDF and Dask cuDF:
With Dask, anything you can do on a single GPU with cuDF you can scale out and in parallel across multiple GPUs. Fundamentally, an instance of a cudf.DataFrame object is a single partition of a distributed GPU DataFrame, managed by Dask. Distributing DataFrames and computation with Dask lets you analyze datasets far larger than a single GPU’s memory without running into out of memory errors. The RAPIDS team is working with Dask maintainers and developers to fully support the GPU DataFrame in Dask...
With cuDF and Dask, whether you’re using a single NVIDIA T4 GPU or using all eight NVIDIA V100 GPUs in a DGX-1, your RAPIDS workflow will run smoothly, intelligently distributing the workload across the resources available. Interested in giving it a try? You can quickly get started with RAPIDS on Google Colab.
In other words, Colab is the free tier for code that can scale.
End to end on the GPU (purple block diagram): https://github.com/rapidsai/cudf
This is a staging place for stuff to add to the Reconstrue Handbook, for how to use these tools.
Single-cell articles
Single-cell "books"