The perf utility reported a bottleneck in this area, where execution stalls
waiting for the 'target' variable to be finalized. Computing the next value of
target one iteration early improves overall performance by about 3% on a small
(128MB) training file.
To apply the change, add next_target next to target in the variable list.
Then, in the two negative sampling blocks:
...
    if (d == 0) {
      target = word;
      label = 1;
      next_random = next_random * (unsigned long long)25214903917 + 11;
      next_target = table[(next_random >> 16) % table_size];
    } else {
      target = next_target;
      if (target == 0) target = next_random % (vocab_size - 1) + 1;
      next_random = next_random * (unsigned long long)25214903917 + 11;
      next_target = table[(next_random >> 16) % table_size];
      if (target == word) continue;
      label = 0;
    }
...
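For anyone who wants to check the idea in isolation, here is a minimal standalone sketch of the same lookahead pattern. It is not the patch itself: the helper names (lcg_next, sample_baseline, sample_lookahead), the out buffer, and the function signatures are hypothetical and exist only for illustration. It assumes the table lookup is what the loop waits on, and it leaves out the label handling and the d == 0 positive sample.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one step of the LCG used in the snippet above. */
static unsigned long long lcg_next(unsigned long long r) {
    return r * 25214903917ULL + 11ULL;
}

/* Baseline: each negative sample is looked up at the moment it is needed. */
static size_t sample_baseline(const int *table, size_t table_size,
                              unsigned long long vocab_size,
                              unsigned long long *rng, int word,
                              int negative, int *out) {
    size_t n = 0;
    assert(table_size > 0 && vocab_size > 1);
    for (int d = 0; d < negative; d++) {
        *rng = lcg_next(*rng);
        int target = table[(*rng >> 16) % table_size];
        if (target == 0) target = (int)(*rng % (vocab_size - 1) + 1);
        if (target == word) continue;   /* skip the positive word */
        out[n++] = target;
    }
    return n;
}

/* Lookahead: the lookup for the next iteration is issued one step early,
   so the load can overlap with the work done on the current target. */
static size_t sample_lookahead(const int *table, size_t table_size,
                               unsigned long long vocab_size,
                               unsigned long long *rng, int word,
                               int negative, int *out) {
    size_t n = 0;
    assert(table_size > 0 && vocab_size > 1);
    *rng = lcg_next(*rng);
    int next_target = table[(*rng >> 16) % table_size];
    for (int d = 0; d < negative; d++) {
        int target = next_target;
        /* *rng still holds the value that produced this target. */
        if (target == 0) target = (int)(*rng % (vocab_size - 1) + 1);
        /* Advance and start the next lookup before using target. */
        *rng = lcg_next(*rng);
        next_target = table[(*rng >> 16) % table_size];
        if (target == word) continue;   /* skip the positive word */
        out[n++] = target;
    }
    return n;
}
```

Starting from the same seed, both functions should yield the same sequence of targets. The lookahead version consumes one extra random number per call, because the final prefetched value is discarded. Whether it is actually faster depends on the compiler and hardware, so it is worth confirming with perf on your own build.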
Original issue reported on code.google.com by chad.p...@gmail.com on 22 Jul 2015 at 6:40