Hi Ben,
Thank you for open-sourcing the code; it has been very helpful! One question, however: why is the number of negative samples hardcoded to 1 in both the current paper and the prob-fast-text paper? Was this for efficiency reasons, or is there reason to believe that a single negative sample is optimal for the current model? Thank you!