Hi @AutoTemp, thank you for the detailed bug report :)
Could you tell me if that exception was raised before or after the `DistributedDataParallel` workers were spawned? Or if you could just share your training log output, I should be able to tell.
For the above example, the output is as follows:
```
2020-07-01 08:00:58,070 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-01 08:00:59,169 - INFO - root - Switching to distributed training mode since multiple GPUs are configured
Master is at: 127.0.0.1:29500 | Rank of this node: 0 | Number of workers in this node: 2 | Number of nodes: 1 | World size: 2
2020-07-01 08:00:59,169 - INFO - allennlp.common.params - datasets_for_vocab_creation = None
2020-07-01 08:00:59,169 - INFO - allennlp.common.params - dataset_reader.type = classification-tsv
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.lazy = True
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.cache_directory = None
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.max_instances = None
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.manual_distributed_sharding = False
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.manual_multi_process_sharding = False
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.tokenizer = None
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.type = single_id
2020-07-01 08:00:59,170 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.namespace = tokens
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.lowercase_tokens = False
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.start_tokens = None
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.end_tokens = None
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.feature_name = text
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.default_value = THIS IS A REALLY UNLIKELY VALUE THAT HAS TO BE A STRING
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.token_min_padding_length = 0
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - dataset_reader.max_tokens = None
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - train_data_path = data/movie_review/train.tsv
2020-07-01 08:00:59,171 - INFO - allennlp.training.util - Reading training data from data/movie_review/train.tsv
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - validation_dataset_reader = None
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - validation_data_path = data/movie_review/dev.tsv
2020-07-01 08:00:59,171 - INFO - allennlp.training.util - Reading validation data from data/movie_review/dev.tsv
2020-07-01 08:00:59,171 - INFO - allennlp.common.params - vocabulary.type = from_instances
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.min_count = None
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.max_vocab_size = None
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.non_padded_namespaces = ('*tags', '*labels')
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.pretrained_files = None
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.only_include_pretrained_words = False
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.tokens_to_add = None
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.min_pretrained_embeddings = None
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.padding_token = @@PADDING@@
2020-07-01 08:00:59,172 - INFO - allennlp.common.params - vocabulary.oov_token = @@UNKNOWN@@
2020-07-01 08:00:59,172 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
1600it [00:01, 834.85it/s]
200it [00:00, 793.34it/s]]
1800it [00:02, 829.82it/s]
2020-07-01 08:01:01,406 - INFO - allennlp.training.util - writing the vocabulary to demo/vocabulary.
2020-07-01 08:01:01,407 - INFO - filelock - Lock 140568736572080 acquired on demo/vocabulary/.lock
2020-07-01 08:01:01,499 - INFO - filelock - Lock 140568736572080 released on demo/vocabulary/.lock
2020-07-01 08:01:01,499 - INFO - allennlp.training.util - done creating vocab
2020-07-01 08:01:02,293 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-01 08:01:02,300 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-01 08:01:02,996 - INFO - allennlp.common.params - Rank 0 | random_seed = 13370
2020-07-01 08:01:02,996 - INFO - allennlp.common.params - Rank 0 | numpy_seed = 1337
2020-07-01 08:01:02,996 - INFO - allennlp.common.params - Rank 0 | pytorch_seed = 133
2020-07-01 08:01:03,108 - INFO - allennlp.common.checks - Rank 0 | Pytorch version: 1.5.1
2020-07-01 08:01:03,150 - INFO - root - Rank 0 | Process group of world size 2 initialized for distributed training in worker 0
2020-07-01 08:01:03,150 - INFO - allennlp.common.params - Rank 0 | type = default
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.type = classification-tsv
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.lazy = True
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.cache_directory = None
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.max_instances = None
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.manual_distributed_sharding = False
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.manual_multi_process_sharding = False
2020-07-01 08:01:03,151 - INFO - allennlp.common.params - Rank 0 | dataset_reader.tokenizer = None
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.type = single_id
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.namespace = tokens
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.lowercase_tokens = False
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.start_tokens = None
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.end_tokens = None
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.feature_name = text
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.default_value = THIS IS A REALLY UNLIKELY VALUE THAT HAS TO BE A STRING
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.token_indexers.tokens.token_min_padding_length = 0
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | dataset_reader.max_tokens = None
2020-07-01 08:01:03,152 - INFO - allennlp.common.params - Rank 0 | train_data_path = data/movie_review/train.tsv
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | datasets_for_vocab_creation = None
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | validation_dataset_reader = None
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | validation_data_path = data/movie_review/dev.tsv
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | validation_data_loader = None
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | test_data_path = None
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | evaluate_on_test = False
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | batch_weight_key =
2020-07-01 08:01:03,153 - INFO - allennlp.training.util - Rank 0 | Reading training data from data/movie_review/train.tsv
2020-07-01 08:01:03,153 - INFO - allennlp.training.util - Rank 0 | Reading validation data from data/movie_review/dev.tsv
2020-07-01 08:01:03,153 - INFO - allennlp.common.params - Rank 0 | vocabulary.type = from_files
2020-07-01 08:01:03,154 - INFO - allennlp.common.params - Rank 0 | vocabulary.directory = demo/vocabulary
2020-07-01 08:01:03,154 - INFO - allennlp.common.params - Rank 0 | vocabulary.padding_token = @@PADDING@@
2020-07-01 08:01:03,154 - INFO - allennlp.common.params - Rank 0 | vocabulary.oov_token = @@UNKNOWN@@
2020-07-01 08:01:03,154 - INFO - allennlp.data.vocabulary - Rank 0 | Loading token dictionary from demo/vocabulary.
2020-07-01 08:01:03,354 - INFO - filelock - Rank 0 | Lock 140121523643560 acquired on demo/vocabulary/.lock
2020-07-01 08:01:03,397 - INFO - filelock - Rank 0 | Lock 140121523643560 released on demo/vocabulary/.lock
2020-07-01 08:01:03,398 - INFO - allennlp.common.params - Rank 0 | model.type = simple_classifier
2020-07-01 08:01:03,398 - INFO - allennlp.common.params - Rank 0 | model.embedder.type = basic
2020-07-01 08:01:03,398 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.type = embedding
2020-07-01 08:01:03,399 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.embedding_dim = 10
2020-07-01 08:01:03,399 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.num_embeddings = None
2020-07-01 08:01:03,399 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.projection_dim = None
2020-07-01 08:01:03,399 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.weight = None
2020-07-01 08:01:03,400 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.padding_index = None
2020-07-01 08:01:03,400 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.trainable = True
2020-07-01 08:01:03,400 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.max_norm = None
2020-07-01 08:01:03,400 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.norm_type = 2.0
2020-07-01 08:01:03,401 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.scale_grad_by_freq = False
2020-07-01 08:01:03,401 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.sparse = False
2020-07-01 08:01:03,401 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.vocab_namespace = tokens
2020-07-01 08:01:03,401 - INFO - allennlp.common.params - Rank 0 | model.embedder.token_embedders.tokens.pretrained_file = None
2020-07-01 08:01:03,405 - INFO - allennlp.common.params - Rank 0 | model.encoder.type = bag_of_embeddings
2020-07-01 08:01:03,406 - INFO - allennlp.common.params - Rank 0 | model.encoder.embedding_dim = 10
2020-07-01 08:01:03,406 - INFO - allennlp.common.params - Rank 0 | model.encoder.averaged = False
2020-07-01 08:01:03,407 - WARNING - root - Rank 0 | vocabulary serialization directory demo/vocabulary is not empty
2020-07-01 08:01:03,407 - INFO - filelock - Rank 0 | Lock 140121523644288 acquired on demo/vocabulary/.lock
2020-07-01 08:01:03,497 - INFO - filelock - Rank 0 | Lock 140121523644288 released on demo/vocabulary/.lock
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.type = default
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.batch_size = 8
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.shuffle = False
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.sampler = None
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.batch_sampler = None
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.num_workers = 2
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.pin_memory = False
2020-07-01 08:01:03,497 - INFO - allennlp.common.params - Rank 0 | data_loader.drop_last = False
2020-07-01 08:01:03,498 - INFO - allennlp.common.params - Rank 0 | data_loader.timeout = 0
2020-07-01 08:01:03,498 - INFO - allennlp.common.params - Rank 0 | data_loader.worker_init_fn = None
2020-07-01 08:01:03,498 - INFO - allennlp.common.params - Rank 0 | data_loader.multiprocessing_context = None
2020-07-01 08:01:03,498 - INFO - allennlp.common.params - Rank 0 | data_loader.batches_per_epoch = None
/home/sunxin/anaconda3/envs/allennlp1.0/lib/python3.7/site-packages/allennlp/data/dataloader.py:74: UserWarning: Using multi-process data loading with a lazy dataset could lead to deadlocks with certain tokenizers. See: https://github.com/allenai/allennlp/issues/4330
  UserWarning,
2020-07-01 08:01:04,068 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-01 08:01:04,258 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
Traceback (most recent call last):
  File "
```
For my own experiments (not the simple reproduction example above), the output is as follows:
```
2020-07-02 01:48:37,802 - INFO - allennlp.common.params - Rank 0 | data_loader.type = default
2020-07-02 01:48:37,803 - INFO - allennlp.common.params - Rank 0 | data_loader.batch_size = 64
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.shuffle = False
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.sampler = None
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.batch_sampler = None
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.num_workers = 1
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.pin_memory = True
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.drop_last = False
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.timeout = 0
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.worker_init_fn = None
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.multiprocessing_context = None
2020-07-02 01:48:37,804 - INFO - allennlp.common.params - Rank 0 | data_loader.batches_per_epoch = None
2020-07-02 01:48:37,841 - INFO - my_package.train_model - Rank 0 | Construct trainer
2020-07-02 01:48:37,842 - INFO - allennlp.common.params - Rank 0 | trainer.type = gradient_descent
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.patience = None
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.validation_metric = -loss
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.num_epochs = 20
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.cuda_device = 0
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.grad_norm = None
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.grad_clipping = None
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.distributed = True
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.world_size = 2
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.num_gradient_accumulation_steps = 1
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.opt_level = None
2020-07-02 01:48:37,843 - INFO - allennlp.common.params - Rank 0 | trainer.no_grad = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.learning_rate_scheduler = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.momentum_scheduler = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.tensorboard_writer = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.moving_average = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.checkpointer = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.batch_callbacks = None
2020-07-02 01:48:37,844 - INFO - allennlp.common.params - Rank 0 | trainer.epoch_callbacks = None
2020-07-02 01:48:38,686 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-02 01:48:38,703 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-02 01:48:38,729 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
2020-07-02 01:48:38,730 - INFO - transformers.file_utils - PyTorch version 1.5.1 available.
Traceback (most recent call last):
  File "
```
When setting num_workers=0, there are no errors.
I suspect the problem is in this code (`train.py` - `GradientDescentTrainer` - `from_partial_objects`):

```
if cuda_device >= 0:
    # Moving model to GPU here so that the optimizer state gets constructed on
    # the right device.
    model = model.cuda(cuda_device)
```
When I wrapped this block with a try/except, it caught an exception there.
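For reference, here is a minimal sketch of that try/except wrapping (the `cuda_device` and `model` names come from the AllenNLP snippet above; the logging is just illustrative, not the library's actual code):

```python
import logging

logger = logging.getLogger(__name__)

if cuda_device >= 0:
    try:
        # Moving model to GPU here so that the optimizer state gets constructed
        # on the right device.
        model = model.cuda(cuda_device)
    except Exception:
        # Log and re-raise so the failure is visible in the spawned worker's
        # output instead of being silently swallowed.
        logger.exception("model.cuda(%d) raised in a distributed worker", cuda_device)
        raise
```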
I'm still not 100% sure what is causing the issue, but I've narrowed it down and found a work-around.
The work-around is to ensure your custom modules (like `my_package` or `my_text_classifier` in these examples) are in your Python path. There are two ways to do this:

1. Set the `PYTHONPATH` environment variable. If you're running `allennlp` from the same directory as your custom modules, just do this: `PYTHONPATH=./ allennlp train ...`
2. Add a `setup.py` file, as sketched below. Once you have that, you can install each of your custom modules as "editable" with `pip install -e .` or `python setup.py develop`.
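For the second option, a minimal `setup.py` could look like this (the package name is just an example matching the modules above):

```python
from setuptools import find_packages, setup

setup(
    name="my_text_classifier",
    version="0.1.0",
    # Finds the my_text_classifier package directory next to this file.
    packages=find_packages(),
)
```

After `pip install -e .`, the module is importable from any process, including spawned data loader workers.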
The underlying issue has something to do with the interaction between multiprocessing and pickle, since multiprocessing uses pickle to spawn and communicate across processes.
In particular, when the `DataLoader` spawns loader workers, it pickle-dumps the `DatasetReader` implemented in `my_text_classifier` so that the spawned workers can each pickle-load the `DatasetReader`. And this is where it's failing: the workers try to load this `DatasetReader` object, but they can't find the `my_text_classifier` module.
What's weird is that pickle should be able to find the `my_text_classifier` module since it's in the current working directory, and usually it can. In fact, like you said, it finds it just fine when you're not using distributed training. But, for some reason, within a data loader worker within a distributed training worker, pickle is failing to resolve the `my_text_classifier` module.
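To see why this matters, recall that pickle serializes a custom class by reference (module name plus class name), so unpickling must re-import that module. Here is a small self-contained sketch of the failure mode; the module and class names are hypothetical stand-ins for the user's plugin:

```python
import pickle
import sys
import types

# Stand-in for the user's plugin package (hypothetical names).
fake_module = types.ModuleType("my_text_classifier")

class ClassificationTsvReader:
    pass

# Pretend the class lives in the plugin module, as it would when loaded
# via --include-package.
ClassificationTsvReader.__module__ = "my_text_classifier"
fake_module.ClassificationTsvReader = ClassificationTsvReader
sys.modules["my_text_classifier"] = fake_module

# The pickle stream stores only a reference:
# "my_text_classifier" + "ClassificationTsvReader".
payload = pickle.dumps(ClassificationTsvReader())

# A spawned DataLoader worker is a fresh interpreter; if the module isn't
# importable there, unpickling fails. Simulate that by removing the module.
del sys.modules["my_text_classifier"]
try:
    pickle.loads(payload)
except ModuleNotFoundError as e:
    print(e)  # No module named 'my_text_classifier'
```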
For anyone interested in pursuing this further, here's a stand-alone reproducible example: https://github.com/epwalsh/allennlp-issue-4428/tree/master.
@epwalsh Thanks for your detailed answer!!
This issue is still valid and keeps happening. Any plans to fix it? I can confirm that `PYTHONPATH=.` fixes it.
Checklist

- I have verified that the issue exists against the `master` branch of AllenNLP.
- I have included in the "Environment" section below the output of `pip freeze`.

Description
When num_workers > 0, DistributedDataParallel, and lazy reading are used together, allennlp reports a ModuleNotFoundError ('my_text_classifier' is passed via --include-package).
Python traceback:
```
Traceback (most recent call last):
  File "", line 1, in
  File "/home/xxx/anaconda3/envs/allennlp1.0/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/home/xxx/anaconda3/envs/allennlp1.0/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'my_text_classifier'
Traceback (most recent call last):
  File "", line 1, in
  File "/home/xxx/anaconda3/envs/allennlp1.0/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/home/xxx/anaconda3/envs/allennlp1.0/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'my_text_classifier'
```
Related issues or possible duplicates
- #3924 (ModuleNotFoundError is also mentioned in that issue, but the solution there does not work)
Environment
OS: Linux
Python version: 3.7.0
Output of `pip freeze`:

```
rsa==3.4.2
s3transfer==0.3.3
sacremoses==0.0.43
scikit-learn==0.20.0
scipy==1.5.0
Send2Trash==1.5.0
sentencepiece==0.1.91
six==1.15.0
spacy==2.1.9
sqlparse==0.3.1
srsly==1.0.2
tensorboardX==2.0
terminado==0.8.3
testpath==0.4.4
thinc==7.0.8
tokenizers==0.7.0
torch==1.5.1
tqdm==4.47.0
transformers==2.11.0
Unidecode==1.1.1
urllib3==1.25.9
wasabi==0.6.0
wcwidth==0.2.5
webencodings==0.5.1
widgetsnbextension==3.5.1
word2number==1.1
zipp==3.1.0
```
Steps to reproduce
Removing --include-package and switching to .allennlp_plugins (with one line: my_text_classifier) gives the same error.
Removing any one of num_workers > 0, DistributedDataParallel, or lazy reading makes the project run fine.
Example source:
```
allennlp train my_text_classifier.jsonnet -s demo --include-package my_text_classifier
```

(also errors when using `.allennlp_plugins`)

`my_text_classifier.jsonnet`:

```
{
    "dataset_reader": {
        "type": "classification-tsv",
        "token_indexers": {
            "tokens": {
                "type": "single_id"
            }
        },
        "lazy": true
    },
    "train_data_path": "data/movie_review/train.tsv",
    "validation_data_path": "data/movie_review/dev.tsv",
    "model": {
        "type": "simple_classifier",
        "embedder": {
            "token_embedders": {
                "tokens": {
                    "type": "embedding",
                    "embedding_dim": 10
                }
            }
        },
        "encoder": {
            "type": "bag_of_embeddings",
            "embedding_dim": 10
        }
    },
    "data_loader": {
        "batch_size": 8,
        "num_workers": 2
    },
    "trainer": {
        "optimizer": "adam",
        "num_epochs": 5
    },
    "distributed": {
        "cuda_devices": [0, 2],
        "num_nodes": 1
    }
}
```