emanjavacas / pie

A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.
MIT License

Train multi-tasks #43

Closed mryolanda closed 4 years ago

mryolanda commented 4 years ago

Hi,

Is it possible to train the lemma tagger and the POS tagger simultaneously? Playing with the suggested model, I have never succeeded.

How should we format the data to use the "POS tagging using a linear decoder and 2 auxiliary tasks" setup? Should we have a column for each task (POS, Case, Number)?

Thanks!

emanjavacas commented 4 years ago

Hi,

It is possible. You need to have both lemma and POS annotations in your input files, and you'll need to include the two tasks in the config file. Something like this gives you a linear decoder for lemma; if you want to decode with the seq2seq instead, change the lemma task to include "level": "char", "decoder": "attentional", "context": "sentence".


  "tasks": [
    {
      "name": "pos", "target": true, "decoder": "crf"
    },
    {
      "name": "lemma"
    }
  ]

Let me know if you have any troubles!
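On the data format: with "sep": "\t" each line carries the token plus one column per annotation, one column per task. A minimal sketch (the header row and column names here are an assumption on my side; check them against the expected format in your pie version):

```
token	lemma	pos
In	in	PRE
principio	principium	NOM
erat	sum	VER
```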

mryolanda commented 4 years ago

Thanks for your answer, very clear! I tried this very basic config file:

{
  "modelname": "lemmatization-latin",
  "modelpath": "models",

  "input_path": "datasets/latin/aloe_train.tsv",
  "dev_path": "datasets/latin/aloe_val.tsv",
  "sep": "\t",

  "tasks": [
    {
      "name": "pos",
      "target": true,
      "decoder": "crf"
    },
    {
      "name": "lemma"
    }
  ]
}

The training crashes when it tries to run the first epoch.

::: Model parameters :::

6899149/6899149 trainable/total

Starting training

Evaluation check every 64/65 batches

::: Task schedules :::

<TaskScheduler patience="1000000" factor="1" threshold="0" min_weight="0">
    <Task name="pos" steps="0" target="True" mode="max" weight="1.0" best="-inf"/>
    <Task name="lemma" steps="0" target="False" mode="max" weight="1.0" best="-inf"/>
</TaskScheduler>
<LrScheduler lr="0.001" lr_steps="0" lr_patience="2"/>

2020-02-11 13:52:58,781 : Starting epoch [1]
Traceback (most recent call last):
  File "/home/yolan/miniconda/envs/pie/bin/pie", line 8, in <module>
    sys.exit(pie_cli())
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/scripts/group.py", line 86, in train
    pie.scripts.train.run(config_path=config_path)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/scripts/train.py", line 162, in run
    scores = trainer.train_epochs(settings.epochs, devset=devset)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/trainer.py", line 341, in train_epochs
    self.train_epoch(devset, epoch)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/trainer.py", line 294, in train_epoch
    loss = self.model.loss(batch, get_batch_task(self.tasks))
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/models/model.py", line 284, in loss
    logits = decoder(outs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/models/decoder.py", line 137, in forward
    enc_out = self.highway(enc_outs)
TypeError: 'NoneType' object is not callable

I also get an error when I change "crf" to "attentional". Do you have any idea what is wrong? Could you suggest a basic but effective lemma + POS tagger setup? Thanks!

PonteIneptique commented 4 years ago

Dear @mryolanda and @emanjavacas, the CRF decoder is buggy in the current release. I added a fix for it in the transformer branch; I'll extract the patch and propose the fix. You can try the linear decoder for now :) Best,

emanjavacas commented 4 years ago

Hi @PonteIneptique , thanks for pointing that out. Is the error related to that new bug, though?

PonteIneptique commented 4 years ago

It is. There is an error in the model definition of the CRF :)


emanjavacas commented 4 years ago

Hey, I pushed a quick fix for the issue. CRF should work now. (It's not on PyPI yet, though, so you'll need to run pie from a local checkout for the fix to take effect.)

mryolanda commented 4 years ago

Thank you both! With a linear decoder it works, as do some other combinations; for instance, this "works" (training runs):

{
  "modelname": "lemmatization-latin",
  "modelpath": "models",

  "input_path": "datasets/latin/aloe_train.tsv",
  "dev_path": "datasets/latin/aloe_val.tsv",
  "sep": "\t",

  "tasks": [
    {
      "name": "lemma",
      "target": true,
      "context": "sentence",
      "level": "char",
      "decoder": "attentional",
      "settings": {
        "bos": true,
        "eos": true,
        "lower": true,
        "target": "lemma"
      }
    },
    {
      "name": "pos"
    }
  ]
}

But in fact it doesn't really work: the results are very bad compared to training the lemma tagger and the POS tagger separately. So I guess I have to play with the parameters and understand them well to improve results, but would you recommend training the lemma tagger and POS tagger simultaneously at all?

Concerning your quick fix for the CRF: in decoder.py I already had

if self.highway is not None:
    enc_outs = self.highway(enc_outs)

so I still get the same error.

emanjavacas commented 4 years ago

Hi! Glad to be able to help.

My first suggestion would be to train simple models. It is not necessarily the case that multi-task learning helps (in fact, and somewhat surprisingly, morphologically related auxiliary tasks haven't been shown to be helpful). You can always train separate models and combine them in the tag script. For lemmatization, feel free to check our paper (linked in the README). For POS, I'd go with a CRF (although sometimes a simple linear decoder goes a long way). Of course, you could also try different multi-task learning configurations and see if you get any improvements; that could also be worth writing about.
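If you do experiment with multi-task configurations, the scheduler parameters visible in your training log (patience, factor, threshold, min_weight) can be tuned per task. An illustrative, unverified sketch (the "schedule" key and its defaults are inferred from the log output above, so double-check the key names against pie's settings):

```
  "tasks": [
    {
      "name": "pos", "target": true, "decoder": "crf"
    },
    {
      "name": "lemma",
      "schedule": {"patience": 2, "factor": 0.5, "threshold": 0.001, "min_weight": 0.1}
    }
  ]
```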

emanjavacas commented 4 years ago

Re the NoneType exception: you shouldn't be getting that same error if you patch the code. If you can provide more context, we may be able to help.

mryolanda commented 4 years ago

1) OK, thank you very much! I will keep running tests and let you know! :) 2) You are right, my bad! But I get new errors. I tried this config:

  "tasks": [
    {
      "name": "pos", 
      "target": true,
      "decoder": "crf",
      "layer": -1
    }
  ]
}

and get :

Starting training

Evaluation check every 64/65 batches

::: Task schedules :::

<TaskScheduler patience="1000000" factor="1" threshold="0" min_weight="0">
    <Task name="pos" steps="0" target="True" mode="max" weight="1.0" best="-inf"/>
</TaskScheduler>
<LrScheduler lr="0.001" lr_steps="0" lr_patience="2"/>

2020-02-11 16:50:14,995 : Starting epoch [1]
2020-02-11 16:50:18,716 : Batch [10/65] || pos:114.571   || 5173 w/s
2020-02-11 16:50:21,291 : Batch [20/65] || pos:85.919   || 6800 w/s
2020-02-11 16:50:23,775 : Batch [30/65] || pos:81.232   || 7043 w/s
2020-02-11 16:50:26,259 : Batch [40/65] || pos:76.692   || 7046 w/s
2020-02-11 16:50:29,080 : Batch [50/65] || pos:74.422   || 6206 w/s
2020-02-11 16:50:31,949 : Batch [60/65] || pos:71.755   || 5929 w/s

Evaluating model on dev set...

15it [00:02,  6.44it/s]

::: Dev losses :::

pos: 68.473

0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/yolan/miniconda/envs/pie/bin/pie", line 8, in <module>
    sys.exit(pie_cli())
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/scripts/group.py", line 86, in train
    pie.scripts.train.run(config_path=config_path)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/scripts/train.py", line 162, in run
    scores = trainer.train_epochs(settings.epochs, devset=devset)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/trainer.py", line 341, in train_epochs
    self.train_epoch(devset, epoch)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/trainer.py", line 325, in train_epoch
    scores = self.run_check(devset)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/trainer.py", line 265, in run_check
    summary = self.model.evaluate(devset, self.dataset)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/models/base_model.py", line 72, in evaluate
    preds = self.predict(inp, **kwargs)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/models/model.py", line 349, in predict
    hyps, _ = decoder.predict(outs, wlen)
  File "/home/yolan/miniconda/envs/pie/lib/python3.6/site-packages/pie/models/decoder.py", line 228, in predict
    for logits_b, len_b in zip(logits.t(), lengths):
RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 3D
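For what it's worth, the RuntimeError above is exactly what Tensor.t() does on anything but a 2-D tensor; for a 3-D batch of logits the code would presumably need transpose(0, 1) instead. A minimal illustration (assuming PyTorch is installed; the shape below is made up for the example):

```python
import torch

# t() is only defined for <= 2-D tensors; on a 3-D tensor it raises
# the same RuntimeError seen in the traceback above.
logits = torch.zeros(10, 4, 7)  # (seq_len, batch, num_tags), illustrative shape
try:
    logits.t()
except RuntimeError as e:
    print("t() failed:", e)

# transpose(0, 1) swaps the first two dimensions and works for any rank,
# which is what iterating over batch elements needs here.
print(tuple(logits.transpose(0, 1).shape))
```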

Should I close this issue and open a new one ?

emanjavacas commented 4 years ago

Yes please, move to a new one. Could you send the commit number, sample text file and the config file as well?