google-research / FLAN

Issue generating flan2021 submix #87

Closed: Bondrake closed this issue 1 year ago

Bondrake commented 1 year ago

Hello, I have successfully generated other submixes, but for FLAN2021 I am getting stuck on this error:

ERROR:absl:Failed to load task 'bool_q_template_0to10_no_opt_zero_shot' as part of mixture 'flan2021_submix'
Traceback (most recent call last):
  File "flan/v2/flan_submix_gen.py", line 84, in <module>
    dataset = selected_mixture.get_dataset(
  File "/usr/local/lib/python3.8/dist-packages/seqio/dataset_providers.py", line 1805, in get_dataset
    ds = task.get_dataset(
  File "/usr/local/lib/python3.8/dist-packages/seqio/dataset_providers.py", line 1443, in get_dataset
    ds = source.get_dataset(
  File "/usr/local/lib/python3.8/dist-packages/seqio/dataset_providers.py", line 496, in get_dataset
    return self.tfds_dataset.load(
  File "/usr/local/lib/python3.8/dist-packages/seqio/utils.py", line 182, in load
    return tfds.load(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/logging/__init__.py", line 166, in __call__
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/load.py", line 639, in load
    _download_and_prepare_builder(dbuilder, download, download_and_prepare_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/load.py", line 498, in _download_and_prepare_builder
    dbuilder.download_and_prepare(**download_and_prepare_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/logging/__init__.py", line 166, in __call__
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/dataset_builder.py", line 691, in download_and_prepare
    self._download_and_prepare(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/dataset_builder.py", line 1583, in _download_and_prepare
    future = split_builder.submit_split_generation(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/split_builder.py", line 341, in submit_split_generation
    return self._build_from_generator(**build_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/core/split_builder.py", line 406, in _build_from_generator
    for key, example in utils.tqdm(
  File "/usr/local/lib/python3.8/dist-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_datasets/datasets/bool_q/bool_q_dataset_builder.py", line 86, in _generate_examples
    "question": row["question"],
KeyError: 'question'

I've also tried other systems and Python 3.10, and I get the same error.

Is this an issue with tfds / bool_q? What would the workaround be?

Bondrake commented 1 year ago

Sorry, this seems to be an issue on my side: tfds was picking up cached data in /root/tensorflow_datasets/downloads/manual/ that doesn't correspond to bool_q.
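
For anyone hitting the same KeyError, here is a rough sketch of how the stale files could be spotted before regenerating the submix. It assumes the bool_q source files are JSON Lines with a "question" field in each record, which is what the traceback suggests. The helper name `files_missing_key` is made up for this example and is not part of tfds or seqio.

```python
import json
from pathlib import Path


def files_missing_key(directory, key="question"):
    """Return names of files whose first non-empty line is not a JSON object with `key`.

    Hypothetical helper for illustration only; not a tfds or seqio API.
    """
    suspicious = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Read only up to the first non-empty line; enough to see the record shape.
        first_line = ""
        with path.open("r", encoding="utf-8", errors="replace") as f:
            for line in f:
                if line.strip():
                    first_line = line
                    break
        try:
            row = json.loads(first_line)
        except json.JSONDecodeError:
            # Empty or non-JSON file: cannot be a bool_q-style JSON Lines file.
            suspicious.append(path.name)
            continue
        if not isinstance(row, dict) or key not in row:
            suspicious.append(path.name)
    return suspicious


if __name__ == "__main__":
    # Path taken from the comment above; adjust for your own tfds data_dir.
    manual_dir = Path("/root/tensorflow_datasets/downloads/manual/")
    if manual_dir.is_dir():
        for name in files_missing_key(manual_dir):
            print("does not look like bool_q data:", name)
```

This only flags files. It does not decide what is safe to delete, since the same directory may hold manual downloads for other datasets. After moving the unrelated files out of the way, rerunning flan/v2/flan_submix_gen.py should let tfds prepare bool_q again.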