The config-parquet-metadata step is failing again for FineWeb with errors like:
"Could not read the parquet files: 504, message='Gateway Time-out', url=URL('https://huggingface.co/datasets/HuggingFaceFW/fineweb/resolve/refs%2Fconvert%2Fparquet/default/train-part1/4089.parquet')"
Maybe this would help (the same is used in the config-parquet-and-info step, which works).
Previous fix was https://github.com/huggingface/dataset-viewer/pull/2884