Closed ekdnam closed 3 years ago
I am currently trying to load the ted_talks_iwslt dataset into google colab.
ted_talks_iwslt
The docs mention the following way of doing so.
dataset = load_dataset("ted_talks_iwslt", language_pair=("it", "pl"), year="2014")
Executing it results in the error attached below.
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-6-7dcc67154ef9> in <module>() ----> 1 dataset = load_dataset("ted_talks_iwslt", language_pair=("it", "pl"), year="2014") 4 frames /usr/local/lib/python3.7/dist-packages/datasets/load.py in load_dataset(path, name, data_dir, data_files, split, cache_dir, features, download_config, download_mode, ignore_verifications, keep_in_memory, save_infos, script_version, use_auth_token, **config_kwargs) 730 hash=hash, 731 features=features, --> 732 **config_kwargs, 733 ) 734 /usr/local/lib/python3.7/dist-packages/datasets/builder.py in __init__(self, writer_batch_size, *args, **kwargs) 927 928 def __init__(self, *args, writer_batch_size=None, **kwargs): --> 929 super(GeneratorBasedBuilder, self).__init__(*args, **kwargs) 930 # Batch size used by the ArrowWriter 931 # It defines the number of samples that are kept in memory before writing them /usr/local/lib/python3.7/dist-packages/datasets/builder.py in __init__(self, cache_dir, name, hash, features, **config_kwargs) 241 name, 242 custom_features=features, --> 243 **config_kwargs, 244 ) 245 /usr/local/lib/python3.7/dist-packages/datasets/builder.py in _create_builder_config(self, name, custom_features, **config_kwargs) 337 if "version" not in config_kwargs and hasattr(self, "VERSION") and self.VERSION: 338 config_kwargs["version"] = self.VERSION --> 339 builder_config = self.BUILDER_CONFIG_CLASS(**config_kwargs) 340 341 # otherwise use the config_kwargs to overwrite the attributes /root/.cache/huggingface/modules/datasets_modules/datasets/ted_talks_iwslt/024d06b1376b361e59245c5878ab8acf9a7576d765f2d0077f61751158e60914/ted_talks_iwslt.py in __init__(self, language_pair, year, **kwargs) 219 description=description, 220 version=datasets.Version("1.1.0", ""), --> 221 **kwargs, 222 ) 223 TypeError: __init__() got multiple values for keyword argument 'version'
How to resolve this?
PS: Thanks a lot @huggingface team for creating this great library!
@skyprince999 as you authored the PR for this dataset, any comments?
This has been fixed in #2064 by @mariosasko (thanks again !)
The fix is available on the master branch and we'll do a new release very soon :)
I am currently trying to load the
ted_talks_iwslt
dataset into google colab.The docs mention the following way of doing so.
Executing it results in the error attached below.
How to resolve this?
PS: Thanks a lot @huggingface team for creating this great library!