Cakeszy opened 4 years ago
I suggest looking at example_entry.py. The candidates are all possible replies to the prompt sentence. In train.py:
```python
for j, candidate in enumerate(utterance["candidates"][-num_candidates:]):
    lm_labels = bool(j == num_candidates - 1)
    instance = build_input_from_segments(persona, history, candidate, tokenizer, lm_labels)
```
The last sentence in candidates is taken as the gold reply; everything else in candidates is treated as a distractor.
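As a minimal, self-contained sketch of that split (the `utterance` dict here is invented for illustration; only the field name `candidates` follows the format in example_entry.py):

```python
# Hypothetical utterance in the example_entry.py format: the last
# candidate is the gold reply, the earlier ones are distractors.
utterance = {
    "candidates": [
        "distractor reply one",
        "distractor reply two",
        "the actual gold reply",
    ]
}

num_candidates = len(utterance["candidates"])
for j, candidate in enumerate(utterance["candidates"][-num_candidates:]):
    lm_labels = bool(j == num_candidates - 1)  # True only for the gold reply
    print(j, lm_labels, candidate)
```

With `num_candidates` equal to the full list length, the slice is a no-op and only the final iteration sets `lm_labels` to `True`.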
> I suggest looking at example_entry.py. The candidates are all possible replies to the prompt sentence.
Could you please share which command you're running to train on example_entry.py?
I'm trying (without modifying example_entry.py):
```
python ./train.py --dataset_path=example_entry.py
```
but I get errors like:
```
ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: Target -100 is out of bounds.
```
Sorry for the late response.
I did not actually use example_entry.py to run an example. As far as I am aware, example_entry.py is just an illustration of the format used in the JSON files.
If you want to see how all the distractors are selected, I suggest adding a print statement in this code snippet from train.py:
```python
for j, candidate in enumerate(utterance["candidates"][-num_candidates:]):
    lm_labels = bool(j == num_candidates - 1)
    print(candidate)
    instance = build_input_from_segments(persona, history, candidate, tokenizer, lm_labels)
```
In my code it's at line 93.
I want to use my own custom dataset with this project, but I don't understand how the distractors were created in the original dataset, so I can't work out how to build my own. Are they randomly sampled from other conversations?
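If distractors are indeed random utterances drawn from other conversations, one way to build candidate lists for a custom dataset could look like the sketch below. This is a hypothetical helper, not code from this repo; the function name, parameters, and example pool are all made up, and the only constraint taken from train.py is that the gold reply must come last:

```python
import random

def build_candidates(gold_reply, all_utterances, num_distractors=2, seed=0):
    """Sample distractors from a pool of utterances taken from other
    conversations, then append the gold reply last, which is the
    position train.py uses to set lm_labels."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    pool = [u for u in all_utterances if u != gold_reply]
    distractors = rng.sample(pool, num_distractors)
    return distractors + [gold_reply]

# Toy pool of utterances from unrelated conversations.
pool = ["hi there", "i love hiking", "what do you do?", "see you later"]
candidates = build_candidates("nice to meet you", pool)
print(candidates)
```

The resulting list can be dropped into the `candidates` field of an entry in the example_entry.py format.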