mgrankin / ru_transformers

Apache License 2.0
776 stars 108 forks

References in the text #6

Closed: iyeldinov closed this issue 4 years ago

iyeldinov commented 4 years ago

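Input: Захотелось выпить ("I felt like having a drink")

Output:

Захотелось выпить и понять. Э.  По: Шаг 1. Перевод Виктора Пелевина. СПб.: Азбука, 2014. С. 17.].) Гегель тоже вспоминает, что наблюдатель в положении сидя способен видеть вещи определенным образом и оставаться при этом вне их видимости.

(English: "I felt like having a drink and understanding. E.  From: Step 1. Translated by Viktor Pelevin. St. Petersburg: Azbuka, 2014. P. 17.].) Hegel also recalls that an observer in a sitting position is able to see things in a certain way while remaining outside their field of view.")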

mgrankin commented 4 years ago

That's not really a bug, since the dataset contains such references, but I understand that it is an undesired artifact. There are several ways to deal with it. The simplest is to provide more context as input, so the model understands that the continuation is not a reference. Alternatively, you can filter out generated references during generation, or clean up the dataset and fine-tune the model on the clean data.
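For the filtering and cleanup options, here is a minimal sketch of one possible heuristic, not something taken from this repository. It assumes that references can be spotted by typical Russian citation markers such as a publisher city ("СПб.:", "М.:") or a page marker ("С. 17"). The names `REFERENCE_PATTERNS`, `looks_like_reference`, `drop_reference_lines`, and `first_clean` are hypothetical, and the patterns would need tuning against the real data.

```python
import re

# Hypothetical heuristics for Russian bibliographic markers.
# These are assumptions for illustration; tune them on your own data.
REFERENCE_PATTERNS = [
    re.compile(r"\b(?:СПб|М|Л)\.\s*:"),  # publisher city, e.g. "СПб.:" or "М.:"
    re.compile(r"\bС\.\s*\d+"),          # page marker, e.g. "С. 17"
]


def looks_like_reference(text: str) -> bool:
    """Return True if any citation marker pattern occurs in the text."""
    return any(p.search(text) for p in REFERENCE_PATTERNS)


def drop_reference_lines(lines):
    """Dataset cleanup: keep only lines without citation markers."""
    return [line for line in lines if not looks_like_reference(line)]


def first_clean(candidates):
    """Generation-time filter: return the first sampled candidate
    without citation markers, or None if every candidate has one."""
    for candidate in candidates:
        if not looks_like_reference(candidate):
            return candidate
    return None
```

`drop_reference_lines` could be run over the training text before fine-tuning, while `first_clean` could be applied to several sampled continuations so that a reference-free one is returned. Both are crude: they will miss references in other formats and may discard legitimate text that happens to contain these markers.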