Open ncclementi opened 1 year ago
I found this very confusing: it is not clear where the files come from.
files = fs.glob("s3://prefect-dask-examples/nyc-uber-lyft/**/*.parquet", detail=True)
files = [f"s3://{v['Key']}" for _, v in files.items() if v["type"] == "file"]
files
The result contains files from multiple folders, such as test_pipeline and processed_files:
['s3://prefect-dask-examples/nyc-uber-lyft/processed_files/run-b367f36a-7fe5-11ed-bfda-0eb16f1fff2f.parquet/fhvhv_tripdata_cee0debf-7fe5-11ed-8000-0e2d60451ecb.parquet',
 's3://prefect-dask-examples/nyc-uber-lyft/processed_files/run-cb471034-7fbc-11ed-bfd5-0e865f2f250d.parquet/fhvhv_tripdata_01d2a68a-7fbd-11ed-8000-0ef46db4ddb1.parquet',
 ...
 's3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/fhvhv_tripdata_001933f8-7fb9-11ed-81c9-0e6d8b22635b.parquet',
 's3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/fhvhv_tripdata_001c4b49-7fb9-11ed-8262-0ee02aa35f83.parquet',
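To make it easier to see which folder each file comes from, the listing could be grouped by the first folder under the bucket prefix. This is only a rough sketch: `group_by_subfolder` is a hypothetical helper, not something from the repository, and the two sample paths are copied from the output above rather than fetched from S3.

```python
from collections import defaultdict


def group_by_subfolder(paths, prefix):
    """Group paths by the first folder that follows `prefix`.

    Hypothetical helper for illustration; paths that do not start
    with `prefix` are skipped.
    """
    groups = defaultdict(list)
    for path in paths:
        if not path.startswith(prefix):
            continue
        # Drop the prefix, then take the first path component that remains.
        relative = path[len(prefix):].lstrip("/")
        subfolder = relative.split("/", 1)[0]
        groups[subfolder].append(path)
    return dict(groups)


PREFIX = "s3://prefect-dask-examples/nyc-uber-lyft"

# Two entries copied from the listing above, one per folder.
sample = [
    "s3://prefect-dask-examples/nyc-uber-lyft/processed_files/"
    "run-b367f36a-7fe5-11ed-bfda-0eb16f1fff2f.parquet/"
    "fhvhv_tripdata_cee0debf-7fe5-11ed-8000-0e2d60451ecb.parquet",
    "s3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/"
    "run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/"
    "fhvhv_tripdata_001933f8-7fb9-11ed-81c9-0e6d8b22635b.parquet",
]

groups = group_by_subfolder(sample, PREFIX)
for name, members in groups.items():
    print(name, len(members))
```

If only one folder is wanted, narrowing the glob pattern itself (for example to the processed_files folder instead of the top-level `**`) would presumably avoid picking up the test runs, assuming that folder is the intended source.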