This PR includes a notebook that creates the df1 dataset using Coiled, following @CerebralMastication's code.
The data is now on S3. It is not public yet, since we might want to keep improving it, but I'd be happy to put it somewhere public if we think this version is good to go.
It also includes a notebook where I run the set_index (step_1) and bigjoin (step_2) operations from the original workflow, along with the performance reports for both. (I'll write up a follow-up issue to link the behavior.)