coiled / data-science-at-scale

A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using the tools and APIs you know and love from the PyData stack (such as numpy, pandas, and scikit-learn).
MIT License

Add env small for binder #36

Open ncclementi opened 2 years ago

ncclementi commented 2 years ago

prep.py checks for the environment variable DASK_TUTORIAL_SMALL. The idea is to use this variable when the tutorial is run on Binder. We should set the env variable when launching Binder so that the tutorial works with a reduced dataset instead of the whole data.
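As a rough illustration of the kind of check involved, here is a minimal sketch. It is not the actual code from prep.py: the helper names `use_small_data` and `n_rows` and the row counts are hypothetical. Only the variable name DASK_TUTORIAL_SMALL comes from the issue.

```python
import os


def use_small_data(environ=None):
    """Return True when DASK_TUTORIAL_SMALL is set to a non-empty value.

    Hypothetical helper for illustration; prep.py may check the
    variable differently.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("DASK_TUTORIAL_SMALL", ""))


def n_rows(environ=None, full=1_000_000, small=10_000):
    """Pick a dataset size; the row counts here are made-up placeholders."""
    return small if use_small_data(environ) else full


if __name__ == "__main__":
    mode = "small" if use_small_data() else "full"
    print(f"Preparing {mode} data with {n_rows()} rows")
```

With this kind of check, any non-empty value (including "0") turns the small mode on, so the Binder launch script would only need to export the variable before the notebook server starts.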

For comparison, this is how it is set in the main dask-tutorial: https://github.com/dask/dask-tutorial/blob/main/binder/start

cc: @pavithraes