Closed kianho closed 3 years ago
Thanks @kianho! Since we will use this as a base to build out the data management and ETL module, would you mind putting it in the following location while we work out the logic for the cfn and Step Functions? Thanks so much.
├── notebooks
├── contrib
│ └── data <-- put it here!
├── modules
│ ├── environment
│ └── pipeline
@josiahdavis done
Issue #, if available: N/A
Description of changes:
Added a minimal example notebook that uses SageMaker Processing to process large, out-of-core datasets with PySpark.
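For context, a job of this kind could be launched roughly as sketched below. This is not the notebook's code: it assumes the SageMaker Python SDK's `PySparkProcessor` is used (the notebook may use a different processor or a custom container), and the helper names, script path, S3 URIs, instance settings, and Spark version are all hypothetical placeholders.

```python
from typing import List


def build_job_arguments(input_uri: str, output_uri: str) -> List[str]:
    """Build the CLI arguments passed to the PySpark script (hypothetical flags)."""
    return ["--input", input_uri, "--output", output_uri]


def launch_spark_processing_job(
    role_arn: str,
    script_path: str,
    input_uri: str,
    output_uri: str,
    instance_count: int = 2,
) -> None:
    """Submit a PySpark script as a SageMaker Processing job.

    Assumption: the SageMaker Python SDK is installed and AWS credentials
    are configured. The import is kept local so the argument helper above
    can be used without the SDK.
    """
    from sagemaker.spark.processing import PySparkProcessor

    processor = PySparkProcessor(
        base_job_name="pyspark-processing-example",  # hypothetical job name
        framework_version="3.1",                     # assumed Spark version
        role=role_arn,
        instance_count=instance_count,               # >1 spreads data across workers
        instance_type="ml.m5.xlarge",                # assumed instance type
        max_runtime_in_seconds=3600,
    )
    # The script reads from input_uri and writes to output_uri with Spark,
    # so the full dataset never has to fit in one machine's memory.
    processor.run(
        submit_app=script_path,
        arguments=build_job_arguments(input_uri, output_uri),
    )
```

Calling `launch_spark_processing_job(role_arn, "preprocess.py", "s3://<bucket>/raw/", "s3://<bucket>/processed/")` would submit the job; the placeholders must be replaced with real values.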
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.