Closed by martindurant 3 years ago
In the absence of a _metadata file, dask presumes that all files in the directory are data files (https://github.com/dask/dask/blob/main/dask/dataframe/io/parquet/fastparquet.py#L169). It should do the same as fastparquet and filter for .parq, .parquet and _metadata*.
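As a rough illustration of the filtering described above, here is a minimal sketch in plain Python. The helper name `select_parquet_paths` and the constant `DATA_SUFFIXES` are hypothetical, invented for this example; this is not the code from dask or fastparquet, only one way the name-based filter could look.

```python
import posixpath

# Suffixes treated as parquet data files, as listed in the comment above.
DATA_SUFFIXES = (".parq", ".parquet")


def select_parquet_paths(paths):
    """Keep paths whose file name ends in .parq/.parquet or starts with _metadata.

    Hypothetical helper for illustration only; not part of dask or fastparquet.
    """
    selected = []
    for path in paths:
        # Only the final path component matters for the filter.
        name = posixpath.basename(path)
        if name.endswith(DATA_SUFFIXES) or name.startswith("_metadata"):
            selected.append(path)
    return selected
```

With a listing such as `["data/part.0.parquet", "data/_SUCCESS", "data/_metadata"]`, this sketch would keep the first and last entries and drop `_SUCCESS`, so stray non-parquet files in the directory would no longer be treated as data files.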
@martindurant - thanks for pointing me in the right direction. I was trying to figure this out in the code and didn't come across the line you pointed out. I will try again!