google-research / FLAN

Apache License 2.0

seqio.get_mixture_or_task('ag_news_subset_template_0_five_shot') failed #72

Open liuzhiyong01 opened 1 year ago

liuzhiyong01 commented 1 year ago

Python script:

selected_mixture = seqio.get_mixture_or_task('ag_news_subset_template_0_five_shot')
INPUT_SEQ_LEN = 2056
TARGET_SEQ_LEN = 512

dataset = selected_mixture.get_dataset(
    sequence_length={"inputs": INPUT_SEQ_LEN, "targets": TARGET_SEQ_LEN},
    split="train",
    shuffle=True,
    num_epochs=1,
    # shard_info=seqio.ShardInfo(index=0, num_shards=10),
    use_cached=False,
    seed=42
)
for i, ex in enumerate(dataset.take(10)):
    print(ex)

This script fails with an error (attached as an image in the original issue).

But when I replace selected_mixture = seqio.get_mixture_or_task('ag_news_subset_template_0_five_shot') with selected_mixture = seqio.get_mixture_or_task('ag_news_subset_template_mix_five_shot'), the script succeeds. Why?

shayne-longpre commented 1 year ago

I'd have to review this. I think there may be a bug in ag_news somewhere.

In the meantime, if you are just interested in downloading the final generated sets, we now link to them in the README! :) Hopefully this circumvents your issue?
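
For anyone trying to narrow this down, one option is to probe several task or mixture names and record which ones load and which ones raise. The following is only a minimal sketch: `probe_names` is a hypothetical helper written for this thread, not part of seqio or FLAN, and since the actual error is not reproduced here, it is an assumption that the failure surfaces as a Python exception at load time.

```python
from typing import Callable, Dict, Iterable, Optional


def probe_names(
    names: Iterable[str],
    loader: Callable[[str], object],
) -> Dict[str, Optional[str]]:
    """Call loader(name) for each name.

    Returns a dict mapping each name to None if the call succeeded,
    or to a short "ExceptionType: message" string if it raised.
    """
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            loader(name)
            results[name] = None
        except Exception as err:  # broad on purpose: we only report the failure
            results[name] = f"{type(err).__name__}: {err}"
    return results
```

With seqio imported and the FLAN tasks registered, this could be called as `probe_names(['ag_news_subset_template_0_five_shot', 'ag_news_subset_template_mix_five_shot'], seqio.get_mixture_or_task)`. If the lookup itself succeeds and the error only appears later, the loader can instead be a small function that also calls `get_dataset(...)` with the same arguments as in the script above.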