bootstrap sample size

Hello,

I was checking your notes (http://www.phontron.com/class/mtandseq2seq2018/assets/slides/mt-fall2018.chapter11.pdf) and saw the following, which seems to be applied in this codebase as well:

"In Line 4, we sample a subset of the test data, where in practice we usually use exactly half of the sentences in the test data."

If I understand correctly, with n sentences in the test set, every bootstrap resample contains only 0.5 * n sentences rather than n. What is the intuition behind using half of the sentences here?
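To make the question concrete, here is a minimal sketch of the procedure as I understand it. It is not taken from this codebase: the function name `paired_bootstrap` and the parameters `num_samples`, `sample_ratio`, and `seed` are hypothetical, and I am assuming that sentences are drawn with replacement and that the metric is a simple average of per-sentence scores (a corpus-level metric such as BLEU would need to be recomputed on each resample instead).

```python
import random


def paired_bootstrap(scores_a, scores_b, num_samples=1000, sample_ratio=0.5, seed=0):
    """Compare two systems on resampled subsets of the test set.

    scores_a / scores_b: per-sentence scores for the same test sentences.
    sample_ratio: fraction of the n test sentences drawn per resample
                  (0.5 corresponds to the "half of the sentences" in the notes).
    Returns the fraction of resamples won by A, won by B, and tied.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    n = len(scores_a)
    k = max(1, int(n * sample_ratio))  # resample size, e.g. 0.5 * n
    rng = random.Random(seed)

    wins_a = wins_b = ties = 0
    for _ in range(num_samples):
        # Assumption: indices are drawn with replacement, and the SAME
        # indices are used for both systems (this is what makes it "paired").
        ids = [rng.randrange(n) for _ in range(k)]
        mean_a = sum(scores_a[i] for i in ids) / k
        mean_b = sum(scores_b[i] for i in ids) / k
        if mean_a > mean_b:
            wins_a += 1
        elif mean_b > mean_a:
            wins_b += 1
        else:
            ties += 1

    return wins_a / num_samples, wins_b / num_samples, ties / num_samples
```

With `sample_ratio=1.0` this would be the textbook bootstrap, where each resample has n sentences; my question is about why `sample_ratio=0.5` is preferred.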

Thanks

neulab / compare-mt

bootstrap sample size #124