Watts-Lab / daphme

Data Access Platform for Human Mobility in Epidemiology

Tests for coarse filtering by geography, time window, and completeness #20

Open GolanTrev opened 2 weeks ago

GolanTrev commented 1 week ago

More generally, this script should be the FIRST step in the pipeline that transforms the data in a meaningful way. That includes re-projecting coordinates and generating new columns like "timestamp"; the user would probably then write the processed data to a new folder.
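As a rough illustration of the "generate new columns" part, here is a minimal sketch that derives a "timestamp" column from a Unix-epoch column. The function name `add_timestamp` and the input column name `unix_ts` are made up for this example and are not part of the repo; rows are plain dicts to keep it dependency-free.

```python
from datetime import datetime, timezone


def add_timestamp(rows, epoch_col="unix_ts", out_col="timestamp"):
    """Return copies of the rows with an ISO 8601 UTC timestamp column added.

    `epoch_col` is a hypothetical input column holding Unix seconds.
    The input rows are not modified.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the raw data stays untouched
        new_row[out_col] = datetime.fromtimestamp(
            row[epoch_col], tz=timezone.utc
        ).isoformat()
        out.append(new_row)
    return out
```

Returning copies rather than mutating in place matches the idea of writing the processed data to a new folder and leaving the raw input as it was.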

Should we use Apache Sedona? Spark (SQL) could handle subsetting by date.

What I would expect is to use daphmeio to handle anything related to column names, folder structure, S3, data types, and WRITING to file. So functions should accept an optional dict parameter that helps them find alternate col_names.
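To make the optional-dict idea concrete, here is a minimal sketch of what such a signature could look like. Everything named here (`DEFAULT_COL_NAMES`, `resolve_col_names`, `coarse_filter`, and the default column names) is hypothetical and not daphmeio's actual API; in practice the defaults and the resolution logic would presumably live in daphmeio.

```python
# Hypothetical defaults; the real ones would be defined in daphmeio.
DEFAULT_COL_NAMES = {
    "user_id": "uid",
    "latitude": "latitude",
    "longitude": "longitude",
    "timestamp": "timestamp",
}


def resolve_col_names(col_names=None):
    """Merge user-supplied overrides into the defaults, rejecting unknown keys."""
    col_names = col_names or {}
    unknown = set(col_names) - set(DEFAULT_COL_NAMES)
    if unknown:
        raise KeyError(f"Unknown column keys: {sorted(unknown)}")
    return {**DEFAULT_COL_NAMES, **col_names}


def coarse_filter(rows, bbox, start, end, col_names=None):
    """Keep rows inside a bounding box and a half-open time window.

    bbox is (min_lon, min_lat, max_lon, max_lat); start and end are Unix
    seconds, with start inclusive and end exclusive.
    """
    cols = resolve_col_names(col_names)
    min_lon, min_lat, max_lon, max_lat = bbox
    kept = []
    for row in rows:
        lon = row[cols["longitude"]]
        lat = row[cols["latitude"]]
        ts = row[cols["timestamp"]]
        if (min_lon <= lon <= max_lon
                and min_lat <= lat <= max_lat
                and start <= ts < end):
            kept.append(row)
    return kept
```

A caller whose data uses different headers would only pass the names that differ, e.g. `coarse_filter(rows, bbox, start, end, col_names={"longitude": "lon", "latitude": "lat"})`, and the filtering code itself never hard-codes a column name.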