I suspect that the earliest examples have distorted probabilities, since only a few tokens had been read when they were generated.
n = 1  # Count examples
for sentence in extract_sentences(extract_tokens(read_text(file_names=docnames))):
    indices = vocabulary.parse(sentence)
    for word, context, y in word2vec.generate_examples([indices], tower):
        examples.writerow([word, context, y])
        n += 1
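The suspicion about early examples can be illustrated with a toy sketch. It assumes that the sampling probabilities are derived from token counts accumulated so far, which the code above does not confirm; `running_estimates` and the toy token list are hypothetical and not part of the original pipeline.

```python
from collections import Counter

def running_estimates(tokens, word, checkpoints):
    """Estimate P(word) from the tokens read so far, at each checkpoint.

    Hypothetical helper: it only illustrates how an estimate based on a
    short prefix can differ from the estimate over the whole corpus.
    """
    counts = Counter()
    targets = set(checkpoints)
    estimates = {}
    for i, token in enumerate(tokens, start=1):
        counts[token] += 1
        if i in targets:
            # Relative frequency of `word` among the first i tokens
            estimates[i] = counts[word] / i
    return estimates

# Toy corpus (made up): "the" accounts for 3 of the 12 tokens overall,
# but all three occurrences fall within the first four tokens.
tokens = ["the", "the", "the", "cat", "sat", "on",
          "a", "mat", "and", "ran", "far", "away"]

estimates = running_estimates(tokens, "the", checkpoints=[4, 12])
print(estimates)
```

In this toy case the estimate after four tokens is three times the estimate over all twelve, so any example generated from the early counts would use a skewed distribution. If that is what happens here, one possible remedy is to make a first pass over the corpus to collect counts before generating any examples.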