Faster partitioning with dask and xarray

There were two key bottlenecks in the Partition workflow, resolved as follows:

probabilistic_partition() iterated over rows in a pandas data frame -- much slower than column-wise operations. Refactoring to use a series of data frame merges in pandas_partition() and dask_partition() enables use of faster column-wise operations. The run_step() method is currently coded to use dask_partition(), which is fast enough to handle the larger Safegraph data. The pandas_partition() implementation would be preferred for smaller datasets.
contact_matrix() also iterates over rows in a pandas data frame to build an xarray. New method build_contact_xr() uses the pandas to_xarray() method to convert a multi-indexed pandas df directly to an xarray (with a bit of tidying up for conformance with test cases).

A rough bench mark is 90 min to partition one day of Safegraph data with the old probabilistic_partition() + contact_matrix() workflow, compared to 27.5 seconds with the dask_partition() + build_contact_xr() workflow on a macbook.

UT-Covid / episimlab

Faster partitioning with dask and xarray #24