In Line 4, we sample a subset of the test data, where in practice we usually use exactly half of the sentences in the test data.
If I understand correctly, if we have n sentences in the test set, this means that every bootstrap resample has only 0.5 * n sentences in it. What is the intuition of using half of the sentences here?
Hello,
I was checking your notes (http://www.phontron.com/class/mtandseq2seq2018/assets/slides/mt-fall2018.chapter11.pdf) and saw the following, which seems to be applied in this codebase as well:
In Line 4, we sample a subset of the test data, where in practice we usually use exactly half of the sentences in the test data.
If I understand correctly, if we have
n
sentences in the test set, this means that every bootstrap resample has only0.5 * n
sentences in it. What is the intuition of using half of the sentences here?Thanks