Closed TommyJones closed 2 years ago
Setting this to zero caused faults in the sampling probabilities, so it seems it should be set to a very small number instead.
My principal concern with a non-zero prior for new terms is that it distorts the prior weight. It might be better to set it to a smaller number, like the lowest quantile or decile of the non-zero elements...
Went with the lowest decile. It's still arbitrary, but at least it's closer to zero, so any distortion of the prior weighting should be small (I hope).
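For reference, a minimal sketch of the "lowest decile of non-zero elements" rule, using NumPy. The function name `extend_prior` and its signature are hypothetical, purely for illustration; this is not the package's actual API:

```python
import numpy as np

def extend_prior(beta, n_new_terms):
    """Extend a topics-by-terms prior matrix with columns for new vocabulary.

    Hypothetical helper: fills new-term columns with the 10th percentile
    (lowest decile) of the existing non-zero prior mass, so the value is
    small but strictly positive and sampling probabilities stay valid.
    """
    nonzero = beta[beta > 0]
    fill = np.quantile(nonzero, 0.10)  # lowest decile of non-zero entries
    new_cols = np.full((beta.shape[0], n_new_terms), fill)
    return np.hstack([beta, new_cols])
```

Because the fill value comes from the low tail of the existing entries, the added mass per new term is bounded above by most of the existing per-term mass, which is why the distortion of the overall prior weight should stay small.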
ce90c9b
Proposed change: add 0 weight for new vocabulary words. The logic is that since they don't appear in the old model, there should be no prior expectation of their appearing.
An alternative would be to try to place them in some sort of rank order based on their frequency in the new corpus. But IMO that needs more theoretical research (which I am doing in my PhD) before making a hard assertion about a prior.