salesforce / decaNLP

The Natural Language Decathlon: A Multitask Challenge for NLP
BSD 3-Clause "New" or "Revised" License

Training on custom dataset and answer is always "o" #49

Open ashleyyy94 opened 5 years ago

ashleyyy94 commented 5 years ago

I am trying to train on a custom WOZ dataset. I have two JSONL files, train.jsonl and val.jsonl, as suggested.

A sample line from my dataset is {"context": "no, i just need to make sure it's cheap. oh, and i need parking", "question": "What is the change in state?", "answer": "name: not mentioned, area: not mentioned, parking: yes, pricerange: cheap, stars: not mentioned, internet: not mentioned, type: hotel;"}
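A quick sanity check on the two files can be sketched as follows. This is a minimal sketch and not part of decaNLP: `check_jsonl_lines` and `REQUIRED_KEYS` are hypothetical names, and it assumes every line should be a JSON object with non-empty string `context`, `question`, and `answer` fields, as in the sample above. It only verifies the file format and does not by itself explain the validation output.

```python
import json

# Assumption: these are the three fields each record needs, as in the sample line.
REQUIRED_KEYS = ("context", "question", "answer")


def check_jsonl_lines(lines):
    """Return a list of (line_number, problem) pairs for malformed records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            value = record.get(key)
            # Each field is expected to be a non-empty string.
            if not isinstance(value, str) or not value:
                problems.append((number, f"'{key}' is missing, empty, or not a string"))
    return problems


# The sample record from the post, serialized back to one JSONL line.
sample = json.dumps({
    "context": "no, i just need to make sure it's cheap. oh, and i need parking",
    "question": "What is the change in state?",
    "answer": "name: not mentioned, area: not mentioned, parking: yes, "
              "pricerange: cheap, stars: not mentioned, internet: not mentioned, type: hotel;",
})
print(check_jsonl_lines([sample]))  # → []
```

To check a whole file, pass an open file handle, for example `check_jsonl_lines(open("val.jsonl", encoding="utf-8"))`; an empty list means every line matched the assumed format.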

This should be the correct format for a training tuple in the dataset. The problem is that when validation runs after 1000 iterations (--val_every 1000), the printed answer is always "o", as shown below:

greedy: 'food: chinese, pricerange: expensive, name: not mentioned, area: centre;'

answer: 'o'

context: 'how about the centre?'

question: 'what is the change in state?'

May I know what I'm doing wrong? The incorrect answer "o" causes all the metrics, such as joint goal em, nf1 and nem, to be zero.

Thank you.