Closed by salman1993 4 years ago
If you edit the train and val data files to include an example with 2 sentences, then data processing and training succeed:
[
{"src": [["Hello", "world", "."], ["sent", "two"]], "labels": [0, 0]},
{"src": [["Hi", "."]], "labels": [0]}
]
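For anyone who wants to recreate a file in this format, here is a minimal sketch that writes the two examples above to disk and reads them back. The output directory is a temporary one chosen only for illustration, and the check that each example has one label per sentence is an assumption inferred from the sample data, not a documented requirement.

```python
import json
import tempfile
from pathlib import Path

# The two examples shown above: one with 2 sentences, one with 1 sentence.
examples = [
    {"src": [["Hello", "world", "."], ["sent", "two"]], "labels": [0, 0]},
    {"src": [["Hi", "."]], "labels": [0]},
]

# Assumption (inferred from the sample): one label per sentence in "src".
for example in examples:
    assert len(example["src"]) == len(example["labels"])

out_dir = Path(tempfile.mkdtemp())  # hypothetical output directory
out_path = out_dir / "train.0.json"
out_path.write_text(json.dumps(examples))

# Read the file back and count the sentences in each example.
loaded = json.loads(out_path.read_text())
sentence_counts = [len(example["src"]) for example in loaded]
print(sentence_counts)  # [2, 1]
```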
@HHousen do you know why this happens?
@salman1993 I've figured this out. The problem was caused by the call to squeeze()
on line 59 here: https://github.com/HHousen/TransformerSum/blob/3921b229c1025dad1759a8bd52a8080a9659d696/src/pooling.py#L56-L62
I've removed this function call because I do not see why it needs to be there anymore.
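To illustrate why an argument-less squeeze() can misbehave on single-sentence batches, here is a small sketch of the shape rule only. It uses plain Python rather than PyTorch, `squeezed_shape` is a hypothetical helper that mirrors the documented behaviour of `torch.Tensor.squeeze`, and the (batch, sentences, hidden) layout and hidden size of 768 are assumptions, not values taken from the linked code.

```python
def squeezed_shape(shape, dim=None):
    """Return the shape produced by squeezing a tensor of the given shape.

    Hypothetical helper mirroring the shape rule of torch.Tensor.squeeze:
    with dim=None every size-1 dimension is dropped; with an explicit dim,
    only that dimension is dropped, and only if its size is 1.
    """
    shape = tuple(shape)
    if dim is None:
        return tuple(size for size in shape if size != 1)
    dim = dim % len(shape)  # allow negative indices such as -1
    if shape[dim] == 1:
        return shape[:dim] + shape[dim + 1:]
    return shape


HIDDEN = 768  # assumed hidden size, for illustration only

# Assumed layout: (batch, sentences, hidden).
two_sents = (2, 2, HIDDEN)  # batch of 2, each example has 2 sentences
one_sent = (2, 1, HIDDEN)   # batch of 2, each example has 1 sentence

print(squeezed_shape(two_sents))  # (2, 2, 768): nothing to squeeze
print(squeezed_shape(one_sent))   # (2, 768): the sentence axis is gone
```

With two sentences per example the call is a no-op, which is why the bug only shows up for single-sentence input. Once the sentence axis is dropped, later layers see a tensor with one fewer dimension than they expect.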
If a batch contains source text with only 1 sentence, training fails (specifically, when loading examples after data processing). The issue seems to be that the output size should have been (2, 1), but instead it's (2, 2). The labels size is fine.
Steps to reproduce:
test_one_sent/train.0.json file:
test_one_sent/val.0.json file:
Command to run:
Error output: