Hi,
When I use a vocabulary larger than 2 million words (e.g., 2.2 million), the validation entropy is always nan.
However, on the exact same data, if I use a slightly smaller vocabulary (1937725 words), the entropy is calculated normally. I limit the vocabulary size by removing rare words from the vocabulary file.
Best regards, Rafael