ExpressAI / DataLab

The unified platform for data-related resources.
https://expressai.github.io/DataLab/
Apache License 2.0

multilexsum, short dataset seems broken #396

Open neubig opened 1 year ago

neubig commented 1 year ago

When loading the short sub-dataset of multilexsum, I get the following error:

Traceback (most recent call last):
  File "/Users/gneubig/work/DataLab/utils/get_dataset_info.py", line 168, in main
    metadata["splits"] = get_splits(file_name, sub_dataset)
  File "/Users/gneubig/work/DataLab/utils/get_dataset_info.py", line 33, in get_splits
    loaded = load_dataset("../datasets/" + dataset, sub_dataset)
  File "/Users/gneubig/work/DataLab/datalabs/load.py", line 2156, in load_dataset
    builder_instance.download_and_prepare(
  File "/Users/gneubig/work/DataLab/datalabs/builder.py", line 747, in download_and_prepare
    self._download_and_prepare(
  File "/Users/gneubig/work/DataLab/datalabs/builder.py", line 868, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/Users/gneubig/work/DataLab/datalabs/builder.py", line 1330, in _prepare_split
    for key, record in utils.tqdm(
  File "/Users/gneubig/opt/anaconda3/envs/explainaboard_web/lib/python3.10/site-packages/tqdm/std.py", line 1183, in __iter__
    for obj in iterable:
  File "/Users/gneubig/.cache/expressai/modules/datasets_modules/datalab/multilexsum/605137a12e4b0c849402f0378769b57154a47bc609238408faae31d79b49315e/multilexsum.py", line 130, in _generate_examples
    summary = original_data["summary/{}".format(self.config.name)].replace("\n", "").strip()
AttributeError: 'NoneType' object has no attribute 'replace'
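
The last frame suggests that the "summary/short" field is None for at least one record, so calling .replace on it fails. Below is a minimal sketch of one possible guard, assuming that some records simply have no summary of the requested length and that skipping them is acceptable. The helper names clean_summary and iter_summaries are hypothetical and are not part of DataLab; only the "summary/{}" key pattern and the replace/strip chain are taken from the traceback.

```python
from typing import Iterable, Iterator, Optional, Tuple


def clean_summary(record: dict, config_name: str) -> Optional[str]:
    """Return the flattened summary for config_name, or None if it is missing.

    Hypothetical helper: mirrors the replace/strip chain from the failing
    line in multilexsum.py, but checks for None before calling .replace.
    """
    summary = record.get("summary/{}".format(config_name))
    if summary is None:
        return None
    return summary.replace("\n", "").strip()


def iter_summaries(
    records: Iterable[dict], config_name: str
) -> Iterator[Tuple[int, str]]:
    """Yield (index, summary) pairs, skipping records without a summary.

    Assumption: skipping is the desired behavior; the loader could instead
    raise a clearer error or emit an empty string.
    """
    for key, record in enumerate(records):
        summary = clean_summary(record, config_name)
        if summary is None:
            continue
        yield key, summary


if __name__ == "__main__":
    # Toy records for illustration only, not real dataset content.
    toy_records = [
        {"summary/long": "first\nlong", "summary/short": None},
        {"summary/long": "second long", "summary/short": " short text\n"},
    ]
    for key, summary in iter_summaries(toy_records, "short"):
        print(key, summary)
```

Whether to skip such records or fail with a clearer message is a design choice for the loader; the sketch only shows where the None check would go.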