stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Different answer each time? #207

Closed kwalcock closed 2 years ago

kwalcock commented 2 years ago

Please excuse the naïve question, but should glove.c be creating a different collection of vectors on two runs with the same input even if the random seed also stays the same? It will do this if the thread count is not 1. If that's expected behavior, I'm going to have to wait out that single thread in order to get a repeatable result and I'm impatient. Thanks for any insight anyone can provide.

AngledLuffa commented 2 years ago

Yes. In a multithreaded process the results may come in at slightly different times, so the math operations get applied in different orders. That leads to small differences which spiral out of control, until what originally looked like a tiny disagreement ends up as a fight lasting for hours, with no one willing to back down over something stupid.

On Tue, Jul 12, 2022 at 10:04 PM Keith Alcock @.***> wrote:

Please excuse the naïve question, but should glove.c be creating a different collection of vectors on two runs with the same input even if the random seed also stays the same? It will do this if the thread count is not 1. If that's expected behavior, I'm going to have to wait out that single thread in order to get a repeatable result and I'm impatient. Thanks for any insight anyone can provide.


kwalcock commented 2 years ago

That is explained so eloquently that it makes up for the disappointment. Thanks for confirming. I'll just have to wait out the single thread.