ncclementi / Dissertation

This is where I start putting together what we will call a dissertation.

test #2

Open ncclementi opened 1 year ago

ncclementi commented 1 year ago

I found this very confusing; it is not clear where things come from.

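# Assumption: `fs` is an s3fs filesystem; the original snippet does not show how it was created.
import s3fs
fs = s3fs.S3FileSystem()
files = fs.glob("s3://prefect-dask-examples/nyc-uber-lyft/**/*.parquet", detail=True)
files = [f"s3://{v['Key']}" for _, v in files.items() if v["type"] == "file"]
files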

The result has files from multiple folders, such as test_pipeline and processed_files:

['s3://prefect-dask-examples/nyc-uber-lyft/processed_files/run-b367f36a-7fe5-11ed-bfda-0eb16f1fff2f.parquet/fhvhv_tripdata_cee0debf-7fe5-11ed-8000-0e2d60451ecb.parquet',
 's3://prefect-dask-examples/nyc-uber-lyft/processed_files/run-cb471034-7fbc-11ed-bfd5-0e865f2f250d.parquet/fhvhv_tripdata_01d2a68a-7fbd-11ed-8000-0ef46db4ddb1.parquet',
 ...
 's3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/fhvhv_tripdata_001933f8-7fb9-11ed-81c9-0e6d8b22635b.parquet',
 's3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/fhvhv_tripdata_001c4b49-7fb9-11ed-8262-0ee02aa35f83.parquet',
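
One way to see where the globbed files come from is to group the paths by the first folder under the common prefix. The following is only a minimal sketch: `group_by_subfolder` and `PREFIX` are hypothetical names introduced here for illustration, and the sample list reuses two paths from the output above instead of querying S3.

```python
from collections import defaultdict

# Assumption: every path of interest lives under this common prefix.
PREFIX = "s3://prefect-dask-examples/nyc-uber-lyft/"


def group_by_subfolder(paths, prefix=PREFIX):
    """Group paths by the first folder name that follows `prefix`."""
    groups = defaultdict(list)
    for path in paths:
        # Drop the shared prefix, then keep the first remaining component.
        relative = path[len(prefix):] if path.startswith(prefix) else path
        subfolder = relative.split("/", 1)[0]
        groups[subfolder].append(path)
    return dict(groups)


# Two paths copied from the listing above, one per folder.
sample = [
    "s3://prefect-dask-examples/nyc-uber-lyft/processed_files/run-b367f36a-7fe5-11ed-bfda-0eb16f1fff2f.parquet/fhvhv_tripdata_cee0debf-7fe5-11ed-8000-0e2d60451ecb.parquet",
    "s3://prefect-dask-examples/nyc-uber-lyft/test_pipeline/run-317fd0ea-7fb6-11ed-bdac-0e6d8b22635b.parquet/fhvhv_tripdata_001933f8-7fb9-11ed-81c9-0e6d8b22635b.parquet",
]

for folder, members in group_by_subfolder(sample).items():
    print(folder, len(members))
```

Passing the full `files` list instead of `sample` would show how many files each folder contributes, which may help decide whether the glob pattern should be narrowed to a single folder.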

ncclementi commented 2 months ago

test