RowitZou / CG-nAR

EMNLP-2021 paper: Thinking Clearly, Talking Fast: Concept-Guided Non-Autoregressive Generation for Open-Domain Dialogue Systems.

Responses generated by the model seem to have many incomplete sentences and incorrect grammar. #4

Closed. 27182812 closed this issue 2 years ago.

27182812 commented 2 years ago

Hello, I find that the responses generated by the model seem to have many incomplete sentences and incorrect grammar. Is this caused by the non-autoregressive decoding? Can it be solved by adjusting parameters?

RowitZou commented 2 years ago

Here are two solutions to improve the generation quality:

  1. You could train insertion-TF with more sampled subsequences. See src/models/model.py lines 557-558.
  2. Replace the vanilla Transformer encoder with BERT. Just add two arguments to the training command: "-encoder BERT -sep_optim" (see the sketch below).
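For a rough idea of what option 2 involves: the vanilla Transformer encoder is swapped for a pretrained BERT encoder loaded through Hugging Face transformers, and the "-sep_optim" flag presumably lets the pretrained encoder be fine-tuned with its own, smaller learning rate. Below is a minimal PyTorch sketch of that idea; the class, hyper-parameters, and two-optimizer setup are illustrative assumptions, not the repository's actual code.

```python
import torch
from transformers import BertModel

class BertEncoder(torch.nn.Module):
    """Wraps a pretrained BERT model as the dialogue-context encoder."""
    def __init__(self, bert_dir="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_dir)

    def forward(self, input_ids, attention_mask):
        # Token-level context representations from the last BERT layer.
        return self.bert(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state

# A "-sep_optim"-style setup: two optimizers, so the pretrained encoder is
# fine-tuned with a smaller learning rate than the rest of the model.
encoder = BertEncoder()
decoder = torch.nn.TransformerDecoderLayer(d_model=768, nhead=8)  # stand-in for insertion-TF
optim_enc = torch.optim.Adam(encoder.parameters(), lr=2e-5)
optim_dec = torch.optim.Adam(decoder.parameters(), lr=1e-3)
```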
27182812 commented 2 years ago

Thanks for your reply! Does it mean I need to add args.bert_dir and download the weights of the BERT model? Do more sampled subsequences depend on the graph construction?

RowitZou commented 2 years ago
  1. args.bert_dir can be the official BERT identifier on Hugging Face, namely 'bert-base-uncased', or you can download the weights to a local directory and point to that path.
  2. We ensure that all concepts are included in the sampled subsequences, plus some randomly selected words from the target response. The subsequence can be regarded as a prompt, and insertion-TF learns to complete the response. Giving more sampled subsequences means insertion-TF can see more states of partial responses at training time. Another solution is to make sure that the subsequence contains only the concepts themselves, so insertion-TF learns to generate responses based on the concepts from scratch (see the sketch below).
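For illustration, here is a rough sketch of such a sampling scheme: every concept token is always kept, a configurable fraction of the remaining target tokens is added at random, and drawing several samples per target exposes insertion-TF to more partial-response states. The function and parameter names are hypothetical and do not correspond to the code at src/models/model.py lines 557-558.

```python
import random

def sample_subsequences(target_tokens, concept_positions, num_samples=4, extra_ratio=0.2):
    """Draw partial responses that always keep the concept tokens.

    num_samples: how many subsequences per target response; raising it lets
                 insertion-TF see more partial-response states in training.
    extra_ratio: fraction of the non-concept tokens kept at random;
                 set it to 0.0 to keep only the concepts themselves.
    """
    non_concept = [i for i in range(len(target_tokens)) if i not in concept_positions]
    subsequences = []
    for _ in range(num_samples):
        n_extra = int(len(non_concept) * extra_ratio)
        extra = set(random.sample(non_concept, n_extra)) if n_extra else set()
        keep = sorted(set(concept_positions) | extra)
        subsequences.append([target_tokens[i] for i in keep])
    return subsequences

# Example: the concepts "movie" (index 4) and "weekend" (index 9) are always kept.
tokens = "i watched a great movie with my friends last weekend".split()
print(sample_subsequences(tokens, concept_positions={4, 9}, num_samples=2))
```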
27182812 commented 2 years ago

Yes, I did. Thank you for your reply. It's very helpful!