shunk031 / huggingface-datasets_JGLUE

JGLUE: Japanese General Language Understanding Evaluation for huggingface datasets
https://huggingface.co/datasets/shunk031/JGLUE

download parquet from hf datasets in `MARC-ja` #10

Closed shunk031 closed 11 months ago

shunk031 commented 11 months ago

Temporary solution for #9.

It appears that Hugging Face's datasets server automatically converts datasets to Parquet files (see "List Parquet files": https://huggingface.co/docs/datasets-server/parquet#list-parquet-files ). The Parquet version of this dataset is available at https://huggingface.co/datasets/shunk031/JGLUE/tree/refs%2Fconvert%2Fparquet. As a temporary measure, our script will load `MARC-ja` from these Parquet files.
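For reference, here is a minimal sketch of how the converted Parquet files could be looked up through the datasets server's `/parquet` endpoint described in the linked docs. It is not the code in this PR: the helper names (`build_parquet_api_url`, `group_parquet_urls`, `fetch_parquet_urls`) are hypothetical, and the sketch assumes the response has the shape shown in the docs, a `parquet_files` list whose entries carry `config`, `split`, and `url` fields.

```python
import json
from collections import defaultdict
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint documented under "List Parquet files" in the datasets-server docs.
API_URL = "https://datasets-server.huggingface.co/parquet"


def build_parquet_api_url(dataset: str) -> str:
    """Build the request URL for one dataset (hypothetical helper)."""
    return f"{API_URL}?{urlencode({'dataset': dataset})}"


def group_parquet_urls(payload: dict, config: str) -> dict:
    """Collect Parquet file URLs per split for a single config.

    Assumes `payload` follows the documented response shape:
    {"parquet_files": [{"config": ..., "split": ..., "url": ...}, ...]}
    """
    urls = defaultdict(list)
    for entry in payload.get("parquet_files", []):
        if entry["config"] == config:
            urls[entry["split"]].append(entry["url"])
    return dict(urls)


def fetch_parquet_urls(dataset: str, config: str) -> dict:
    """Query the datasets server and return {split: [url, ...]} (needs network)."""
    with urlopen(build_parquet_api_url(dataset)) as response:
        payload = json.load(response)
    return group_parquet_urls(payload, config)
```

Calling `fetch_parquet_urls("shunk031/JGLUE", "MARC-ja")` should return a mapping from split name to Parquet URLs, which could then be passed as `data_files` to `datasets.load_dataset("parquet", data_files=...)`. The splits and files actually returned depend on what the server has converted.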