cpllab / lm-zoo

Easy black-box access to state-of-the-art language models
https://cpllab.github.io/lm-zoo/
MIT License

add dialoGPT-medium #65

Open AnneBeyer opened 3 years ago

AnneBeyer commented 3 years ago

Model

DialoGPT-medium extends GPT-2-medium by fine-tuning it on Reddit data to model dialogue. The model uses the EOS token to mark a speaker change. In the input, a speaker change is represented by the [SEP] token, which requires some modifications to get_surprisals.py and tokenizer.py.
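To illustrate the [SEP] handling described above, here is a minimal sketch of the preprocessing step. It is not the actual change to get_surprisals.py or tokenizer.py: the helper name `mark_speaker_changes` is hypothetical, and dropping the whitespace around each marker is an assumption. The EOS string `<|endoftext|>` is the end-of-text token of the GPT-2 vocabulary.

```python
# Hypothetical helper, not part of lm-zoo: rewrite "[SEP]" speaker-change
# markers in the input into the GPT-2 end-of-text token.
EOS_TOKEN = "<|endoftext|>"  # end-of-text token string in the GPT-2 vocabulary
SEP_MARKER = "[SEP]"         # speaker-change marker used in the input


def mark_speaker_changes(text: str, sep: str = SEP_MARKER, eos: str = EOS_TOKEN) -> str:
    """Replace every speaker-change marker in `text` with the EOS token.

    Assumption: whitespace directly around a marker is dropped, so the
    turns are joined by the EOS token alone.
    """
    turns = [turn.strip() for turn in text.split(sep)]
    return eos.join(turns)


if __name__ == "__main__":
    dialogue = "Does money buy happiness? [SEP] Depends how much you have."
    print(mark_speaker_changes(dialogue))
```

The rewritten string can then be passed to the tokenizer as usual, so each speaker change is encoded as the single EOS token id.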

@inproceedings{zhang-etal-2020-dialogpt,
    title = "{DIALOGPT} : Large-Scale Generative Pre-training for Conversational Response Generation",
    author = "Zhang, Yizhe  and
      Sun, Siqi  and
      Galley, Michel  and
      Chen, Yen-Chun  and
      Brockett, Chris  and
      Gao, Xiang  and
      Gao, Jianfeng  and
      Liu, Jingjing  and
      Dolan, Bill",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-demos.30",
    doi = "10.18653/v1/2020.acl-demos.30",
    pages = "270--278"
}

Training

Licensing

MIT License