SuperCowPowers / sageworks

SageWorks: An easy to use Python API for creating and deploying AWS SageMaker Models
https://www.supercowpowers.com
MIT License

DataLoader: Heavy: Partitioning data based on date #329

Open brifordwylie opened 1 year ago

brifordwylie commented 1 year ago

Add functionality to the Glue jobs that pull in data so that they partition the data by date (year-month-day). We'll need a bit of Spark DataFrame code that computes a date field from the data and then sets that field as the partition key.
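A rough sketch of what that Spark code might look like, assuming the incoming data has a timestamp column. The column name `partition_date`, the `timestamp_column` and `output_path` parameters, the append write mode, and the Parquet output are all placeholders for illustration, not existing SageWorks code:

```python
from datetime import datetime

# Hypothetical name for the computed partition column
PARTITION_COLUMN = "partition_date"


def partition_value(ts: datetime) -> str:
    """Return the year-month-day partition value for a single timestamp."""
    return ts.strftime("%Y-%m-%d")


def write_partitioned(df, timestamp_column: str, output_path: str) -> None:
    """Add a year-month-day column to a Spark DataFrame and write it
    partitioned by that column (Parquet and append mode are assumptions)."""
    # Imported here so partition_value() can be used without Spark installed
    from pyspark.sql import functions as F

    (
        df.withColumn(
            PARTITION_COLUMN,
            F.date_format(F.col(timestamp_column), "yyyy-MM-dd"),
        )
        .write.mode("append")
        .partitionBy(PARTITION_COLUMN)
        .parquet(output_path)
    )
```

With this layout each day lands under its own `partition_date=YYYY-MM-DD/` prefix, and `partition_value()` gives the same string on the Python side if we need to build or check a prefix outside of Spark.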

brifordwylie commented 1 year ago

Closely related to #318