yzhangcs / parser

:rocket: State-of-the-art parsers for natural language.
https://parser.yzhang.site/
MIT License
827 stars 139 forks

CUDA error: device-side assert triggered #69

Closed attardi closed 3 years ago

attardi commented 3 years ago

I run into this error when training a model with encoder=bert on the Italian UD training set:

python -u -m supar.cmds.biaffine_dep train -p=exp/it_isdt.dbmdz-electra-xxl/model \
    -c=config.ini --bert=dbmdz/electra-base-italian-xxl-cased-discriminator \
    --train=../train-dev/UD_Italian-ISDT/it_isdt-ud-train.conllu \
    --dev=../train-dev/UD_Italian-ISDT/it_isdt-ud-dev.conllu --feat=bert --encoder=bert

/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [4,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [5,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [6,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [7,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [8,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [9,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [10,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [11,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [12,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [13,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [14,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [15,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [16,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [17,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [18,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [19,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/cmds/biaffine_dep.py", line 46, in <module>
    main()
  File "/project/piqasso/Experiment/IWPT21/parser/supar/cmds/biaffine_dep.py", line 42, in main
    parse(parser)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/cmds/cmd.py", line 29, in parse
    parser.train(**args)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/dep.py", line 62, in train
    return super().train(**Config().update(locals()))
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/parser.py", line 73, in train
    loss, test_metric = self._evaluate(test.loader)
  File "/usr/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/dep.py", line 197, in _evaluate
    arc_preds, rel_preds = self.model.decode(s_arc, s_rel, mask, self.args.tree, self.args.proj)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/models/dep.py", line 220, in decode
    bad = [not CoNLL.istree(seq[1:i+1], proj) for i, seq in zip(lens.tolist(), arc_preds.tolist())]
RuntimeError: CUDA error: device-side assert triggered
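
The assertion `t >= 0 && t < n_classes` in the log above comes from the NLL/cross-entropy kernel and fires when a target index lies outside the range of classes being scored. A minimal sketch of the same check in plain Python, which could be run on gold indices before they reach the GPU to get a readable report; the helper name `find_out_of_range_targets` and the sample values are hypothetical and not part of SuPar:

```python
from typing import Iterable, List, Tuple


def find_out_of_range_targets(
    targets: Iterable[int], n_classes: int, ignore_index: int = -100
) -> List[Tuple[int, int]]:
    """Return (position, value) pairs for targets outside [0, n_classes).

    ignore_index mirrors the default of PyTorch's CrossEntropyLoss:
    targets equal to it are skipped rather than reported.
    """
    bad = []
    for pos, t in enumerate(targets):
        if t == ignore_index:
            continue
        if not 0 <= t < n_classes:
            bad.append((pos, t))
    return bad


# Hypothetical data: gold head indices, scored against only 3 candidate positions.
gold_heads = [2, 0, 2, 5, 4]
print(find_out_of_range_targets(gold_heads, n_classes=3))  # [(3, 5), (4, 4)]
```

Any non-empty result means the loss would trip the device-side assert for that batch.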

yzhangcs commented 3 years ago

Could you send me the training data so I can reproduce the error?

attardi commented 3 years ago

You can get them from here:

https://ufal.mff.cuni.cz/~zeman/soubory/iwpt2021-train-dev.tgz

Here is the config.ini:

[Data]
encoder = 'bert'
bert = 'xlm-roberta-large'

[Network]
n_bert_layers = 4
mix_dropout = .0
bert_pooling = 'mean'
encoder_dropout = .1
n_arc_mlp = 500
n_rel_mlp = 100
mlp_dropout = .33
n_embed = 0                     # no word embeddings

[Optimizer]
lr = 5e-5
lr_rate = 20
clip = 5.0
min_freq = 2
fix_len = 20
epochs = 10
warmup = 0.1
batch_size = 2000
update_steps = 5
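
A config like the one above can be read with Python's standard `configparser`. The sketch below is one possible way to load it into a flat dictionary and compare two such files, for example a local `config.ini` against a config shipped with the repository. It is not necessarily how SuPar itself parses the file, and `load_flat_config` and `diff_configs` are hypothetical helper names:

```python
import ast
import configparser

# The config from this comment, embedded as a string so the sketch is self-contained.
CONFIG_TEXT = """
[Data]
encoder = 'bert'
bert = 'xlm-roberta-large'

[Network]
n_bert_layers = 4
mix_dropout = .0
bert_pooling = 'mean'
encoder_dropout = .1
n_arc_mlp = 500
n_rel_mlp = 100
mlp_dropout = .33
n_embed = 0                     # no word embeddings

[Optimizer]
lr = 5e-5
lr_rate = 20
clip = 5.0
min_freq = 2
fix_len = 20
epochs = 10
warmup = 0.1
batch_size = 2000
update_steps = 5
"""


def load_flat_config(text: str) -> dict:
    """Parse INI text and merge all sections into one dict of Python values."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    flat = {}
    for section in parser.sections():
        for key, raw in parser.items(section):
            # Values are written as Python literals ('bert', .33, 5e-5, ...).
            flat[key] = ast.literal_eval(raw)
    return flat


def diff_configs(a: dict, b: dict) -> dict:
    """Map each key whose value differs to its (a, b) pair; missing keys show as None."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


mine = load_flat_config(CONFIG_TEXT)
other = dict(mine, epochs=50)  # hypothetical second config, for illustration only
print(mine["bert"], mine["n_embed"], mine["lr"])
print(diff_configs(mine, other))  # {'epochs': (10, 50)}
```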

yzhangcs commented 3 years ago

@attardi I tried the command

python -u -m supar.cmds.biaffine_dep train -b -d 6 -p model -c configs/ud.biaffine.dep.xlmr.ini --encoder=bert --bert=dbmdz/electra-base-italian-xxl-cased-discriminator --train data/iwpt21/UD_Italian-ISDT/it_isdt-ud-train.conllu --dev data/iwpt21/UD_Italian-ISDT/it_isdt-ud-dev.conllu --punct

and it runs without errors.

attardi commented 3 years ago

Isn't configs/ud.biaffine.dep.xlmr.ini the same as mine?

I still get the same error:

...
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/dep.py", line 62, in train
    return super().train(**Config().update(locals()))
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/parser.py", line 73, in train
    loss, test_metric = self._evaluate(test.loader)
  File "/usr/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/parsers/dep.py", line 197, in _evaluate
    arc_preds, rel_preds = self.model.decode(s_arc, s_rel, mask, self.args.tree, self.args.proj)
  File "/project/piqasso/Experiment/IWPT21/parser/supar/models/dep.py", line 220, in decode
    bad = [not CoNLL.istree(seq[1:i+1], proj) for i, seq in zip(lens.tolist(), arc_preds.tolist())]
RuntimeError: CUDA error: device-side assert triggered

attardi commented 3 years ago

The problem was the missing --test file.

attardi commented 3 years ago

I now get the error for another language, even when supplying the test file:

python -u -m supar.cmds.biaffine_dep train -d 1 -b -p exp/nl_dutch.wietsedv/model -c ud.biaffine.dep.xlmr.ini --bert=wietsedv/bert-base-dutch-cased --train ../train-dev/UD_Dutch/nl_dutch-ud-train.conllu --dev ../train-dev/UD_Dutch/nl_dutch-ud-dev.conllu --test ../train-dev/UD_Dutch/nl_dutch-ud-dev.conllu --feat=bert --encoder=bert --punct

2021-05-05 01:24:42 INFO Epoch 1 / 10:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/attardi/IWPT21/parser.orig/supar/cmds/biaffine_dep.py", line 46, in <module>
    main()
  File "/home/attardi/IWPT21/parser.orig/supar/cmds/biaffine_dep.py", line 42, in main
    parse(parser)
  File "/home/attardi/IWPT21/parser.orig/supar/cmds/cmd.py", line 29, in parse
    parser.train(**args)
  File "/home/attardi/IWPT21/parser.orig/supar/parsers/dep.py", line 62, in train
    return super().train(**Config().update(locals()))
  File "/home/attardi/IWPT21/parser.orig/supar/parsers/parser.py", line 70, in train
    self._train(train.loader)
  File "/home/attardi/IWPT21/parser.orig/supar/parsers/dep.py", line 167, in _train
    loss.backward()
  File "/home/attardi/venv/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/attardi/venv/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
    Variable._execution_engine.run_backward(
RuntimeError: CUDA error: device-side assert triggered
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
...

yzhangcs commented 3 years ago

@attardi Hi, it also works for me

python -u -m supar.cmds.biaffine_dep train -d 7 -b -p model -c configs/ud.biaffine.dep.xlmr.ini --bert=wietsedv/bert-base-dutch-cased --train data/iwpt21/UD_Dutch-Alpino/nl_alpino-ud-train.conllu --dev data/iwpt21/UD_Dutch-Alpino/nl_alpino-ud-dev.conllu --encoder=bert --punct

MinionAttack commented 3 years ago

Hi, I'm having the same problem. This is my first time using this kind of program, so I don't know if I'm doing something wrong.

My command is:

python -m supar.cmds.biaffine_dep train -b -d 0 -c config/biaffine-ud-en-ewt.ini -p model/UD --train=/data/UD/en_ewt-ud-train.conllu --dev=/data/UD/en_ewt-ud-dev.conllu --test=/data/UD/en_ewt-ud-dev.conllu --feat=bert --encoder=bert -f char

The config file:

[Data]
encoder = 'bert'
bert = 'xlm-roberta-large'

[Network]
n_bert_layers = 4
mix_dropout = .0
bert_pooling = 'mean'
encoder_dropout = .1
n_arc_mlp = 500
n_rel_mlp = 100
mlp_dropout = .33
n_embed = 0                     # no word embeddings

[Optimizer]
lr = 5e-5
lr_rate = 20
clip = 5.0
min_freq = 2
fix_len = 20
epochs = 10
warmup = 0.1
batch_size = 2000
update_steps = 5

And the error I get:

2021-05-05 10:02:24 INFO Epoch 1 / 10:
0%|                  | 0/1 00:00<?, ?it/s
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [1,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [4,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [5,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [6,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [7,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [8,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [9,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [10,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [11,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [12,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [13,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [14,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [15,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [16,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [17,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [18,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [19,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [20,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [21,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [22,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [23,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [24,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [25,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [26,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [27,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [28,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [29,0,0] Assertion t >= 0 && t < n_classes failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [30,0,0] Assertion t >= 0 && t < n_classes failed.
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu line=115 error=710 : device-side assert triggered
Traceback (most recent call last):         
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/iago/Escritorio/SuPar/supar/cmds/biaffine_dep.py", line 46, in <module>
    main()
  File "/home/iago/Escritorio/SuPar/supar/cmds/biaffine_dep.py", line 42, in main
    parse(parser)
  File "/home/iago/Escritorio/SuPar/supar/cmds/cmd.py", line 29, in parse
    parser.train(**args)
  File "/home/iago/Escritorio/SuPar/supar/parsers/dep.py", line 62, in train
    return super().train(**Config().update(locals()))
  File "/home/iago/Escritorio/SuPar/supar/parsers/parser.py", line 70, in train
    self._train(train.loader)
  File "/home/iago/Escritorio/SuPar/supar/parsers/dep.py", line 165, in _train
    loss = self.model.loss(s_arc, s_rel, arcs, rels, mask, self.args.partial)
  File "/home/iago/Escritorio/SuPar/supar/models/dep.py", line 194, in loss
    arc_loss = self.criterion(s_arc, arcs)
  File "/home/iago/Escritorio/SuPar/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/iago/Escritorio/SuPar/env/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1048, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/iago/Escritorio/SuPar/env/lib/python3.6/site-packages/torch/nn/functional.py", line 2693, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/iago/Escritorio/SuPar/env/lib/python3.6/site-packages/torch/nn/functional.py", line 2388, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:115

Greetings.

yzhangcs commented 3 years ago

@MinionAttack Could you post the complete log output?

MinionAttack commented 3 years ago

@yzhangcs Hi, for easier reading I've attached a txt file with all the console output.

Greetings.

console-output.txt

yzhangcs commented 3 years ago

@MinionAttack I think the problem comes from the datasets not being loaded properly, since each one contains only 1 sentence, as shown in your log:

train: Dataset(n_sentences=1, n_batches=1, n_buckets=1)
dev:   Dataset(n_sentences=1, n_batches=1, n_buckets=1)
test:  Dataset(n_sentences=1, n_batches=1, n_buckets=1)

Please check whether the paths exist.
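
The check suggested here can be done before launching a run. A minimal sketch, assuming the data files are in CoNLL-U format with blank lines between sentences and `#` comment lines; `count_conllu_sentences` is a hypothetical helper, not part of SuPar:

```python
import os
import tempfile


def count_conllu_sentences(path: str) -> int:
    """Count sentences in a CoNLL-U file, failing loudly if the path is wrong."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no such data file: {path}")
    n_sentences, in_sentence = 0, False
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                in_sentence = False  # a blank line ends the current sentence
            elif not line.startswith("#") and not in_sentence:
                n_sentences += 1  # first token line of a new sentence
                in_sentence = True
    return n_sentences


# Tiny made-up sample with two sentences, written to a temporary file.
sample = (
    "# text = Hello world\n"
    "1\tHello\t_\t_\t_\t_\t2\tdiscourse\t_\t_\n"
    "2\tworld\t_\t_\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\tBye\t_\t_\t_\t_\t0\troot\t_\t_\n"
    "\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.conllu")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    print(count_conllu_sentences(path))  # 2
```

A count that disagrees with the `n_sentences` value in the training log, or a `FileNotFoundError`, points at the path rather than the model.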

MinionAttack commented 3 years ago

Hi, I forgot to specify the "--embed" (path to pretrained embeddings) parameter because in the config file I've set n_embed = 0. Do I need to specify the file even if I don't use embeddings?

EDIT:

The problem wasn't the missing "--embed" parameter. The problem was that the paths began with a "/", which made them absolute, so I had:

--train=/data/UD/en_ewt-ud-train.conllu --dev=/data/UD/en_ewt-ud-dev.conllu --test=/data/UD/en_ewt-ud-dev.conllu

instead of:

--train=data/UD/en_ewt-ud-train.conllu --dev=data/UD/en_ewt-ud-dev.conllu --test=data/UD/en_ewt-ud-dev.conllu

Greetings.
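
The difference between the two sets of paths in this comment can be seen with `pathlib`: a path starting with "/" is absolute and ignores the directory the command is run from. A short sketch, where the project directory `/home/user/SuPar` and the helper `resolve_against` are hypothetical:

```python
from pathlib import PurePosixPath


def resolve_against(cwd: str, user_path: str) -> str:
    """Join user_path onto cwd the way a POSIX shell would resolve it."""
    # Joining with an absolute path discards everything before it.
    return str(PurePosixPath(cwd) / user_path)


project_dir = "/home/user/SuPar"  # hypothetical working directory

print(resolve_against(project_dir, "data/UD/en_ewt-ud-train.conllu"))
# /home/user/SuPar/data/UD/en_ewt-ud-train.conllu

print(resolve_against(project_dir, "/data/UD/en_ewt-ud-train.conllu"))
# /data/UD/en_ewt-ud-train.conllu
```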

yzhangcs commented 3 years ago

@MinionAttack You don't need any embeddings, since you are using --encoder bert. The model then consists of only transformer layers, MLPs and Biaffines.

attardi commented 3 years ago

Here is the model.log file:

2021-05-05 00:45:41 INFO
---------------------+-------------------------------
Param | Value
---------------------+-------------------------------
encoder | bert
bert | wietsedv/bert-base-dutch-cased
n_bert_layers | 4
mix_dropout | 0.0
bert_pooling | mean
encoder_dropout | 0.1
n_arc_mlp | 500
n_rel_mlp | 100
mlp_dropout | 0.33
lr | 5e-05
lr_rate | 20
clip | 5.0
min_freq | 2
fix_len | 20
epochs | 50
warmup | 0.1
batch_size | 2000
update_steps | 5
tree | False
proj | False
partial | False
mode | train
path | exp/nl_dutch.wietsedv/model
device | 2
seed | 1
threads | 16
local_rank | -1
feat | ['bert']
build | True
punct | True
max_len | None
buckets | 32
train | ../train-dev/UD_Dutch/nl_dutch-ud-train.conllu
dev | ../train-dev/UD_Dutch/nl_dutch-ud-dev.conllu
test | None
embed | data/glove.6B.100d.txt
unk | unk
n_embed | 100
---------------------+-------------------------------

2021-05-05 00:45:41 INFO Building the fields
2021-05-05 00:45:43 INFO CoNLL(
  (words): SubwordField(pad=[PAD], unk=[UNK], bos=[CLS])
  (texts): RawField()
  (arcs): Field(bos=, use_vocab=False)
  (rels): Field(bos=)
)
2021-05-05 00:45:43 INFO Building the model
2021-05-05 00:45:52 INFO BiaffineDependencyModel(
  (encoder): TransformerEmbedding(wietsedv/bert-base-dutch-cased, n_layers=4, n_out=768, stride=256, pooling=mean, pad_index=3, requires_grad=True)
  (encoder_dropout): Dropout(p=0.1, inplace=False)
  (arc_mlp_d): MLP(n_in=768, n_out=500, dropout=0.33)
  (arc_mlp_h): MLP(n_in=768, n_out=500, dropout=0.33)
  (rel_mlp_d): MLP(n_in=768, n_out=100, dropout=0.33)
  (rel_mlp_h): MLP(n_in=768, n_out=100, dropout=0.33)
  (arc_attn): Biaffine(n_in=500, bias_x=True)
  (rel_attn): Biaffine(n_in=100, n_out=2, bias_x=True, bias_y=True)
  (criterion): CrossEntropyLoss()
)

2021-05-05 00:45:52 INFO Loading the data
2021-05-05 00:45:52 INFO train: Dataset(n_sentences=1, n_batches=1, n_buckets=1)
dev: Dataset(n_sentences=718, n_batches=32, n_buckets=32)

2021-05-05 00:45:53 INFO Epoch 1 / 50:

yzhangcs commented 3 years ago

@attardi It seems that the train file is not being read correctly. Please check the file path and file format.

train: Dataset(n_sentences=1, n_batches=1, n_buckets=1)

attardi commented 3 years ago

You are right: I misspelled the train file name ('-' instead of '_').
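
A near-miss file name like this one can be caught automatically by comparing the requested name against what is actually in the directory. A small sketch using the standard `difflib` module; `suggest_similar_files` and the file names in the demo are hypothetical:

```python
import difflib
import os
import tempfile


def suggest_similar_files(path: str, cutoff: float = 0.8) -> list:
    """List up to three files in the same directory whose names resemble path's."""
    directory, name = os.path.split(path)
    directory = directory or "."
    if not os.path.isdir(directory):
        return []
    candidates = os.listdir(directory)
    return difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)


# Demo with made-up names: the file on disk uses '_' where the request uses '-'.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "it_sample-ud-train.conllu"), "w").close()
    requested = os.path.join(tmp, "it-sample-ud-train.conllu")
    if not os.path.isfile(requested):
        print(suggest_similar_files(requested))  # ['it_sample-ud-train.conllu']
```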