Closed by albertvillanova 6 months ago
Finally, there is a different issue now: https://github.com/huggingface/datasets/actions/runs/8627153993/job/23646584590?pr=6797
FAILED tests/test_load.py::ModuleFactoryTest::test_HubDatasetModuleFactoryWithParquetExport - datasets.utils._dataset_viewer.DatasetViewerError: No exported Parquet files available.
FAILED tests/test_load.py::ModuleFactoryTest::test_HubDatasetModuleFactoryWithParquetExport_errors_on_wrong_sha - datasets.utils._dataset_viewer.DatasetViewerError: No exported Parquet files available.
FAILED tests/test_load.py::test_load_dataset_builder_for_community_dataset_with_script - AssertionError: assert 'dataset_with_script' == 'parquet'
- parquet
+ dataset_with_script
This may be related to the hf-internal-testing/dataset_with_script dataset: https://huggingface.co/datasets/hf-internal-testing/dataset_with_script
Requesting this URL: https://datasets-server.huggingface.co/parquet?dataset=hf-internal-testing/dataset_with_script returns an error:
{"error":"The dataset viewer doesn't support this dataset because it runs arbitrary python code. Please open a discussion in the discussion tab if you think this is an error and tag @lhoestq and @severo."}
Was there a recent change on the Hub enforcing this behavior?
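For anyone who wants to reproduce the check locally, here is a minimal sketch of how the viewer response quoted above could be inspected. The helper names `build_parquet_url` and `viewer_error` are hypothetical (they are not part of `datasets`), and the only assumption about the response format is what the quoted payload shows: a JSON object with an `"error"` key when the export is refused.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Endpoint copied from the URL quoted in this thread.
PARQUET_ENDPOINT = "https://datasets-server.huggingface.co/parquet"


def build_parquet_url(dataset: str) -> str:
    """Build the /parquet query URL for a dataset repo id (hypothetical helper)."""
    # Keep "/" unescaped so the repo id reads the same as in the thread.
    return f"{PARQUET_ENDPOINT}?{urlencode({'dataset': dataset}, safe='/')}"


def viewer_error(body: str) -> Optional[str]:
    """Return the "error" message from a JSON response body, or None if absent.

    Assumption: a refused request yields a JSON object with an "error" key,
    as in the payload quoted above. Nothing else about the schema is assumed.
    """
    payload = json.loads(body)
    if isinstance(payload, dict):
        return payload.get("error")
    return None
```

Fetch the URL from `build_parquet_url("hf-internal-testing/dataset_with_script")` with any HTTP client and pass the response text to `viewer_error`; a non-`None` result means the viewer refused the Parquet export for that dataset.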
OK, I just saw this PR:
Once merged and deployed, it should fix the issue.
Once the script-dataset has been allowed in the dataset-viewer, we should fix our test to make the CI pass.
I am addressing this.
CI is broken for test_load_dataset_distributed_with_script. See: https://github.com/huggingface/datasets/actions/runs/8614926216/job/23609378127