weka511 / nlp

My experiments with Natural Language Processing. I've created a few programs to try out concepts.
GNU General Public License v3.0

Loss increasing following resume #32

Closed weka511 closed 1 year ago

weka511 commented 1 year ago

I resumed:

```
./word2vec2.py train --resume --load train-k2 --save train-k2-a --plot train-k2-a --eta 0.001 --N 100000
```

The loss appears to have doubled, from around 0.5 to around 1.0. [attached plot: train-k2]

weka511 commented 1 year ago

This happens even on the first loss calculation after resuming, before any weights have been updated!
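The internals of word2vec2.py aren't shown in this thread, but one way a loss estimate can jump before any weight update is if the stochastic part of the estimate, such as the negative samples, is redrawn from a fresh RNG state after a resume instead of being restored from the checkpoint. A minimal sketch (the loss form and all names here are hypothetical, not taken from the repository):

```python
import numpy as np

def skipgram_neg_loss(W, C, centers, contexts, negatives):
    """Mean skip-gram negative-sampling loss:
    -log sigma(w.c) - sum_k log sigma(-w.n_k)."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    pos = -np.log(sig(np.sum(W[centers] * C[contexts], axis=1)))
    neg = -np.log(sig(-np.einsum('id,ikd->ik', W[centers], C[negatives]))).sum(axis=1)
    return (pos + neg).mean()

# Toy embeddings and a fixed batch of (center, context) pairs
rng = np.random.default_rng(0)
V, d, k, B = 50, 8, 5, 32
W = rng.normal(0, 0.1, (V, d))          # centre-word embeddings
C = rng.normal(0, 0.1, (V, d))          # context-word embeddings
centers = rng.integers(0, V, B)
contexts = rng.integers(0, V, B)

# Identical weights and data, but negatives drawn from two different RNG
# states, as would happen if the sampler's state is not checkpointed:
neg_a = np.random.default_rng(1).integers(0, V, (B, k))
neg_b = np.random.default_rng(2).integers(0, V, (B, k))
loss_a = skipgram_neg_loss(W, C, centers, contexts, neg_a)
loss_b = skipgram_neg_loss(W, C, centers, contexts, neg_b)
# loss_a and loss_b differ even though no weight was updated
```

If the jump is this large and systematic, though, a changed hyperparameter (such as eta or k) on resume is the more likely culprit than sampling noise.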

weka511 commented 1 year ago

```
Loaded ./data\dummy.npz eta=0.04929739848256657, loss 2.3568230232668697, k=5, width=2
C:\Users\Weka\nlp\word2vec2.py:177: UserWarning: Calculated tau 70382 exceeds N 1000
  warn(f'Calculated tau {tau} exceeds N {N}')
There are 216626 groups. Initial Loss=4.86660994e+00
```
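The tau warning suggests a learning-rate decay whose time constant is derived from the starting and target eta, and that after resuming with the checkpointed eta the derived tau overshoots N. A hypothetical reconstruction, assuming an inverse-time schedule eta(t) = eta0 / (1 + t/tau); the actual schedule, target eta, and N in word2vec2.py may differ:

```python
from warnings import warn

def make_schedule(eta0, eta_final, N):
    """Inverse-time decay eta(t) = eta0 / (1 + t/tau), with tau chosen
    so that eta(N) equals eta_final (assumed form, not from the repo)."""
    tau = N / (eta0 / eta_final - 1)
    if tau > N:
        warn(f'Calculated tau {tau:.0f} exceeds N {N}')
    return lambda t: eta0 / (1 + t / tau)

# With the eta0 loaded from the checkpoint already close to the
# (assumed) target, tau blows up past N and the warning fires:
eta = make_schedule(eta0=0.04929739848256657, eta_final=0.04, N=1000)
```

When tau exceeds N, eta barely decays over the run, so a resume that loads a stale (too large) eta effectively trains at that rate for the whole run.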

weka511 commented 1 year ago

Trace what is happening from the baseline (train-k2).

weka511 commented 1 year ago

```
./word2vec2.py train --vocabulary gatsby.npz --examples gatsby.csv --save gatsby --plot gatsby --show --freq 100 --dimension 128 --N 50000 --eta 0.01
```

weka511 commented 1 year ago

Uniform doesn't appear to be very successful. [attached plot: uniform]

weka511 commented 1 year ago

This is an artifact:

  1. The "final" error was computed the last time the program printed progress, so it wasn't really final. This has been corrected.
  2. Sometimes the program restarted with an eta that was too large.
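Point 1 can be sketched as follows: if the loss is recorded only every `freq` steps, the last recorded value lags behind the true final loss, and appending one extra record after the loop corrects it. All names here are illustrative stand-ins, not code from word2vec2.py:

```python
def step_loss(step):
    # Stand-in for one SGD step; returns the loss after that step
    return 1.0 / (1 + step)

def train(N, freq=100):
    losses = []
    loss = None
    for step in range(N):
        loss = step_loss(step)
        if step % freq == 0:
            losses.append(loss)   # loss recorded only at progress reports
    losses.append(loss)           # the fix: also record the true final loss
    return losses
```

Before the fix, the reported "final" loss for N=250 would have been the value at step 200 (the last progress report), not the value at step 249.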