Hi. The EM and DA labels were automatically annotated with fine-tuned classifiers. In particular, the DA classifier may suffer from a domain gap, since it was fine-tuned on EmpatheticDialogues but applied to Reddit. Hence there are some cases that the classifier fails to categorize correctly.
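To make the setup concrete, here is a minimal sketch of what such automatic annotation could look like, assuming an off-the-shelf Hugging Face classifier. The checkpoint name and label list below are placeholders, not the actual fine-tuned models or taxonomy from the paper:

```python
# Minimal sketch: a sequence classifier fine-tuned on EmpatheticDialogues
# would be applied to Reddit utterances to produce DA labels.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DA_LABELS = ["agreeing", "sympathizing", "questioning"]  # hypothetical subset

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
# In practice this would load the fine-tuned DA checkpoint, not a bare encoder.
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(DA_LABELS)
)

def annotate_da(utterance: str) -> str:
    """Predict a dialog-act label for a single utterance."""
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return DA_LABELS[int(logits.argmax(dim=-1))]

# The domain gap shows up here: a head tuned on EmpatheticDialogues can
# assign a confident but wrong label to Reddit-style text like this.
print(annotate_da("well done!"))
```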
Gotcha, thanks for the clarification!
By the way, how much data did you actually annotate? I just want to get a better idea of how much effort is required to train a model like this, in case I want to build one for another language. Thank you!
You can refer to the original paper (https://aclanthology.org/2021.findings-acl.72.pdf), Sections 4.2 and 4.4. I am sorry that I cannot estimate how much data is required to fine-tune a dialog model from a pretrained checkpoint or to train one from scratch.
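For a rough sense of the mechanics (though not of the data requirements), fine-tuning a dialog model from a pretrained checkpoint typically looks like the sketch below. The checkpoint, file name, and hyperparameters are all placeholders, and this is not the authors' training code:

```python
# Generic sketch of fine-tuning a causal-LM dialog model from a checkpoint.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

checkpoint = "microsoft/DialoGPT-small"  # placeholder pretrained checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers lack a pad token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# "dialogs.txt" is a hypothetical file with one flattened dialog per line.
dataset = load_dataset("text", data_files={"train": "dialogs.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=dataset,
    # mlm=False yields standard next-token labels, with padding masked out.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```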
OK, I just went back to Section 4.2 of the paper; of course you wouldn't know. Sorry about my bad question, and thanks for your quick response.
Hey, just wondering if these are considered mistakes in the training data; some of the emotion and dialog-act labels look a bit odd...
For example,
seeker post: couldn't work or face any social situations for months because of depression and self confidence issues. decided to switch careers and chase a passion of cooking i've always had. met a girl. happy again. it definitely gets better people chin up!
seeker em: joy
seeker da: agreeing

response: well done!
response em: admiration
response da: sympathizing
Would you explain why "well done!" would be labeled as a sympathizing act? I have come across many cases like this and started to wonder whether I misunderstood how to use the data.
Please help me understand, thank you!
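For what it's worth, one quick way to audit cases like this is to tally the annotated (DA, EM) pairs over the corpus and eyeball the most frequent combinations. The file name and field names below are hypothetical; they would need to be adapted to the actual release format:

```python
# Count (dialog act, emotion) label pairs on responses to spot-check
# suspicious combinations such as ("sympathizing", "admiration").
import json
from collections import Counter

pairs = Counter()
with open("annotated_dialogs.jsonl", encoding="utf-8") as f:  # hypothetical file
    for line in f:
        record = json.loads(line)  # assumes one JSON record per line
        pairs[(record["response_da"], record["response_em"])] += 1

for (da, em), count in pairs.most_common(10):
    print(f"{da:>15} / {em:<12} {count}")
```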