stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

warnings at compile time #5

Closed hrzafer closed 8 years ago

hrzafer commented 8 years ago

I get the following warnings on Ubuntu 14.04 LTS:

make
mkdir -p build
gcc src/glove.c -o build/glove -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
src/glove.c: In function ‘glove_thread’:
src/glove.c:85:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     long long id = (long long) vid;
                    ^
src/glove.c: In function ‘train_glove’:
src/glove.c:256:86: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
         for (a = 0; a < num_threads; a++) pthread_create(&pt[a], NULL, glove_thread, (void *)a);
                                                                                      ^
gcc src/shuffle.c -o build/shuffle -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
src/shuffle.c:30:48: warning: integer overflow in expression [-Woverflow]
 static const long LRAND_MAX = ((long) RAND_MAX + 2) * (long)RAND_MAX;
                                                ^
src/shuffle.c: In function ‘rand_long’:
src/shuffle.c:56:31: warning: integer overflow in expression [-Woverflow]
         rnd = ((long)RAND_MAX + 1) * (long)rand() + (long)rand();
                               ^
gcc src/cooccur.c -o build/cooccur -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
gcc src/vocab_count.c -o build/vocab_count -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result

ghost commented 8 years ago

I was not able to reproduce these on my distro. The warnings all seem rather benign, but it would be good to figure out why you're getting them and I am not. What is your version of gcc? And are you running 32-bit or something funny like that?
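
A quick way to answer the 32-bit question is to look at the type widths the compiler uses. The following is a minimal sketch, not part of GloVe; the function names `pointer_matches_long_long`, `lrand_max_fits_in_long`, and `print_data_model` are hypothetical and exist only for illustration. It assumes the cast warnings come from a pointer and a `long long` having different sizes, and the overflow warnings from `long` being too narrow for the `RAND_MAX` product in the log above.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if a pointer and a long long have the same size on this target,
   in which case casting between them is not a cast "of different size". */
int pointer_matches_long_long(void) {
    return sizeof(void *) == sizeof(long long);
}

/* Returns 1 if ((long)RAND_MAX + 2) * (long)RAND_MAX fits in a long.
   The check divides in unsigned long long so it cannot overflow itself. */
int lrand_max_fits_in_long(void) {
    unsigned long long a = (unsigned long long)RAND_MAX + 2;
    unsigned long long b = (unsigned long long)RAND_MAX;
    return a <= (unsigned long long)LONG_MAX / b;
}

/* Prints the widths that matter for the warnings in this thread. */
void print_data_model(FILE *out) {
    assert(out != NULL);
    fprintf(out, "sizeof(void *)    = %lu\n", (unsigned long)sizeof(void *));
    fprintf(out, "sizeof(long)      = %lu\n", (unsigned long)sizeof(long));
    fprintf(out, "sizeof(long long) = %lu\n", (unsigned long)sizeof(long long));
    fprintf(out, "RAND_MAX          = %ld\n", (long)RAND_MAX);
}
```

If `pointer_matches_long_long()` returns 0 or `lrand_max_fits_in_long()` returns 0, that would be consistent with a 32-bit build. `gcc --version` reports the compiler version.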

ghost commented 8 years ago

The first 2 issues have been addressed by https://github.com/stanfordnlp/glove/pull/11.
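
For readers hitting the same two cast warnings elsewhere, one common pattern is to route the conversion through `intptr_t`, which has the same width as a pointer on mainstream platforms. This is a hedged sketch and not necessarily what the pull request does; `thread_id_to_arg` and `arg_to_thread_id` are hypothetical helper names, and the integer-to-pointer round trip is implementation-defined in C, although it works on common targets.

```c
#include <assert.h>
#include <stdint.h>

/* Packs a small thread index into the void * argument slot of pthread_create.
   Going through intptr_t avoids the "different size" cast warnings. */
void *thread_id_to_arg(long long id) {
    /* The index must survive narrowing to intptr_t (true for thread counts). */
    assert((long long)(intptr_t)id == id);
    return (void *)(intptr_t)id;
}

/* Recovers the thread index inside the thread function. */
long long arg_to_thread_id(void *arg) {
    return (long long)(intptr_t)arg;
}
```

With these helpers, the call from the log would read `pthread_create(&pt[a], NULL, glove_thread, thread_id_to_arg(a))`, and the thread function would start with `long long id = arg_to_thread_id(vid);`.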

The 3rd and 4th issues with integer overflow correspond to the helper function rand_long(long n), which produces numbers in [0, n-1], with n as high as 2^31, rather than the smaller [0, RAND_MAX] range supported by rand() (the C standard only guarantees that RAND_MAX is at least 32767). According to Stack Overflow, there is not really a better way of doing this. In the worst case, this results in an improper shuffle of the corpus on systems that implement the undefined overflow behavior in a malicious way. If anyone has a better dependency-free version of rand_long(long n), please submit a pull request. For now, I'm going to mark this issue as closed though.
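
One possible dependency-free direction, offered as a sketch and not as the code in src/shuffle.c: do the arithmetic in `uint64_t`, where wraparound is well defined, and use rejection sampling to remove modulo bias. The names `rand_bits60` and `rand_long_sketch` are hypothetical. The sketch assumes that `RAND_MAX + 1` is a multiple of 2^15 (true for the common values 32767 and 2147483647) and that `n` is at most 2^60. Its output quality is still limited by `rand()` itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Builds 60 random bits from four rand() calls, using the low 15 bits of each.
   All arithmetic is unsigned, so no signed overflow can occur. */
uint64_t rand_bits60(void) {
    uint64_t r = 0;
    int i;
    for (i = 0; i < 4; i++) {
        r = (r << 15) | ((uint64_t)rand() & 0x7FFF);
    }
    return r;
}

/* Returns a value in [0, n-1] for 0 < n <= 2^60. */
long rand_long_sketch(long n) {
    const uint64_t range = (uint64_t)1 << 60;
    uint64_t un, limit, r;

    assert(n > 0);
    un = (uint64_t)n;
    assert(un <= range);

    /* Largest multiple of n that is <= 2^60; draws at or above it are
       rejected so that every residue is equally likely. */
    limit = range - (range % un);
    do {
        r = rand_bits60();
    } while (r >= limit);

    return (long)(r % un);
}
```

The rejection loop discards fewer than half of the draws in the worst case, so the expected number of `rand_bits60()` calls per result stays below two.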