epwalsh / nlp-models

NLP research experiments, built on PyTorch within the AllenNLP framework.
MIT License

allennlp.common.checks.ConfigurationError: key "data_loader" is required #45

Open MJ2468 opened 2 years ago

MJ2468 commented 2 years ago

Hi, while researching CopyNet I found your implementation, and it looks very useful. I tried to run it on Colab, but I get a ConfigurationError. To fit the Colab environment, I've tried entering both 'experiments/greetings/copynet.json' and '/content/nlp-models/experiments/greetings/copynet.json' as the model file. I also tried changing the paths to the training and validation data in experiments/greetings/copynet.json. In every case I get the same error. This is the post I'm following: https://medium.com/@epwalsh10/incorporating-a-copy-mechanism-into-sequence-to-sequence-models-40917280b89d

Could you tell me what I can do?

Here is my process:

!pip install pipenv
!pip install allennlp
!pip install overrides==3.1.0
!pip install scipy==1.7.3
!pip install allennlp-models
!git clone https://github.com/epwalsh/nlp-models.git
cd nlp-models/
!make data/greetings.tar.gz
!make train

(This is the error.)

Enter model file: /content/nlp-models/experiments/greetings/copynet.json
Enter model directory (default is /tmp/models/copynet):
Detected 8 previous runs. Serializing model to /tmp/models/copynet/run_009
Is this correct? [Y/n] Y
2022-07-27 04:17:56,673 - INFO - allennlp.common.plugins - Plugin allennlp_models available
2022-07-27 04:17:56,736 - INFO - allennlp.common.params - evaluation = None
2022-07-27 04:17:56,737 - INFO - allennlp.common.params - include_in_archive = None
2022-07-27 04:17:56,737 - INFO - allennlp.common.params - random_seed = 13370
2022-07-27 04:17:56,737 - INFO - allennlp.common.params - numpy_seed = 1337
2022-07-27 04:17:56,737 - INFO - allennlp.common.params - pytorch_seed = 133
2022-07-27 04:17:56,738 - INFO - allennlp.common.checks - Pytorch version: 1.11.0+cu102
2022-07-27 04:17:56,739 - INFO - allennlp.common.params - type = default
2022-07-27 04:17:56,740 - INFO - allennlp.common.params - dataset_reader.type = copynet_seq2seq
2022-07-27 04:17:56,740 - INFO - allennlp.common.params - dataset_reader.max_instances = None
2022-07-27 04:17:56,740 - INFO - allennlp.common.params - dataset_reader.manual_distributed_sharding = False
2022-07-27 04:17:56,740 - INFO - allennlp.common.params - dataset_reader.manual_multiprocess_sharding = False
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.target_namespace = target_tokens
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.source_tokenizer = None
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.target_tokenizer = None
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.type = characters
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.namespace = token_characters
2022-07-27 04:17:56,741 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.character_tokenizer = <allennlp.data.tokenizers.character_tokenizer.CharacterTokenizer object at 0x7f4c2068c8d0>
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.start_tokens = None
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.end_tokens = None
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.min_padding_length = 0
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.token_characters.token_min_padding_length = 0
/usr/local/lib/python3.7/dist-packages/allennlp/data/token_indexers/token_characters_indexer.py:60: UserWarning: You are using the default value (0) of min_padding_length, which can cause some subtle bugs (more info see https://github.com/allenai/allennlp/issues/1954). Strongly recommend to set a value, usually the maximum size of the convolutional layer size when using CnnEncoder.
  UserWarning,
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.type = single_id
2022-07-27 04:17:56,742 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.namespace = source_tokens
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.lowercase_tokens = False
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.start_tokens = None
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.end_tokens = None
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.feature_name = text
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.default_value = THIS IS A REALLY UNLIKELY VALUE THAT HAS TO BE A STRING
2022-07-27 04:17:56,743 - INFO - allennlp.common.params - dataset_reader.source_token_indexers.tokens.token_min_padding_length = 0
/usr/local/lib/python3.7/dist-packages/spacy/util.py:837: UserWarning: [W095] Model 'en_core_web_sm' (3.4.0) was trained with spaCy v3.4 and may not be 100% compatible with the current version (3.3.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
  warnings.warn(warn_msg)
2022-07-27 04:17:57,369 - INFO - allennlp.common.params - train_data_path = /content/nlp-models/data/greetings/train.tsv
2022-07-27 04:17:57,370 - CRITICAL - root - Uncaught exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/params.py", line 211, in pop
    value = self.params.pop(key)
KeyError: 'data_loader'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.7/dist-packages/allennlp/__main__.py", line 39, in run
    main(prog="allennlp")
  File "/usr/local/lib/python3.7/dist-packages/allennlp/commands/__init__.py", line 120, in main
    args.func(args)
  File "/usr/local/lib/python3.7/dist-packages/allennlp/commands/train.py", line 120, in train_model_from_args
    file_friendly_logging=args.file_friendly_logging,
  File "/usr/local/lib/python3.7/dist-packages/allennlp/commands/train.py", line 186, in train_model_from_file
    return_model=return_model,
  File "/usr/local/lib/python3.7/dist-packages/allennlp/commands/train.py", line 264, in train_model
    file_friendly_logging=file_friendly_logging,
  File "/usr/local/lib/python3.7/dist-packages/allennlp/commands/train.py", line 498, in _train_worker
    ddp_accelerator=ddp_accelerator,
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/from_params.py", line 608, in from_params
    **extras,
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/from_params.py", line 636, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/from_params.py", line 207, in create_kwargs
    cls.__name__, param_name, annotation, param.default, params, **extras
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/from_params.py", line 310, in pop_and_construct_arg
    popped_params = params.pop(name, default) if default != _NO_DEFAULT else params.pop(name)
  File "/usr/local/lib/python3.7/dist-packages/allennlp/common/params.py", line 216, in pop
    raise ConfigurationError(msg)
allennlp.common.checks.ConfigurationError: key "data_loader" is required
✗ Training job /content/nlp-models/experiments/greetings/copynet.json failed.
Total time: 0 hours, 0 minutes and 8 seconds elapsed.

epwalsh commented 2 years ago

Hey @MJ2468, this version of CopyNet was built to run with an older version of AllenNLP. Since then I've ported this implementation into the official AllenNLP Models repository. I'd recommend you use that one. https://github.com/allenai/allennlp-models
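
For anyone hitting the same error, here is a minimal sketch of how to check whether a config was written for an older AllenNLP before launching training. It assumes that pre-1.0 configs declared batching under a top-level "iterator" key, where newer releases expect "data_loader". The helper name `find_legacy_keys` and the example config fragment are hypothetical and are not taken from this repository.

```python
import json

# Assumption: older AllenNLP configs used "iterator"; newer releases
# require "data_loader" instead. Extend this mapping as needed.
LEGACY_TO_CURRENT = {"iterator": "data_loader"}


def find_legacy_keys(config):
    """Return (legacy_key, expected_key) pairs for outdated top-level keys."""
    return [
        (old, new)
        for old, new in LEGACY_TO_CURRENT.items()
        if old in config and new not in config
    ]


# Hypothetical config fragment, for illustration only.
example_config = json.loads(
    '{"dataset_reader": {"type": "copynet_seq2seq"},'
    ' "iterator": {"type": "bucket", "batch_size": 32}}'
)

for old, new in find_legacy_keys(example_config):
    print(f'config uses "{old}" but the installed AllenNLP requires "{new}"')
```

A check like this only reports the mismatch. It does not migrate the config, so switching to the allennlp-models implementation recommended above is still the safer route.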