SimuJenni / word2vec

Repo for R&D research project

Build for Mac? #1

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
On a Mac:
1. svn checkout http://word2vec.googlecode.com/svn/trunk/
2. make

What is the expected output?
Binary is emitted.

What do you see instead?
pindari:word2vec pmonks$ make
gcc word2vec.c -o word2vec -lm -pthread -Ofast -march=native -Wall 
-funroll-loops -Wno-unused-result
cc1: error: invalid option argument ‘-Ofast’
cc1: error: unrecognized command line option "-Wno-unused-result"
word2vec.c:1: error: bad value (native) for -march= switch
word2vec.c:1: error: bad value (native) for -mtune= switch
make: *** [word2vec] Error 1
pindari:word2vec pmonks$

What version of the product are you using?
SVN r32

On what operating system?
Mac OSX 10.8.4

Original issue reported on code.google.com by peter.mo...@alfresco.com on 15 Aug 2013 at 5:45

GoogleCodeExporter commented 8 years ago
Updating gcc will fix this issue: e.g., 
http://superuser.com/questions/517218/how-do-i-install-gcc-4-7-2-on-os-x-10-8. 
(You'll probably have other issues after that, though. I still can't get this 
to work on OS X.)

Original comment by jesse.cz...@gmail.com on 15 Aug 2013 at 6:34

GoogleCodeExporter commented 8 years ago
Got it to work with the following steps:

1) Update gcc to 4.7: 
http://superuser.com/questions/517218/how-do-i-install-gcc-4-7-2-on-os-x-10-8
2) Change "-march=native" to "-msse4.2" in makefile
3) Add "-I/usr/include/sys" to makefile "CFLAGS = " statement

Original comment by jesse.cz...@gmail.com on 15 Aug 2013 at 7:22

GoogleCodeExporter commented 8 years ago
It compiles if you remove the -Ofast, -Wno-unused-result and -march gcc 
options, and replace malloc.h with stdlib.h in the include statements. There 
might be a better way, though.

Original comment by eaton....@gmail.com on 15 Aug 2013 at 8:01
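
For anyone who would rather script the flag removal than edit by hand, here is a minimal sketch. It assumes the build file is named "makefile" and that the flags appear exactly as in the gcc command quoted in the report. The helper name strip_flags and the output name makefile.mac are made up for illustration. It writes a modified copy instead of editing in place, and it does not touch the malloc.h include.

```shell
#!/bin/sh
# Sketch only: strip_flags is a hypothetical helper, not part of word2vec.
# It copies a makefile while dropping the three options the stock
# compiler rejected in the original report.
strip_flags() {
  # $1 = input makefile, $2 = output makefile
  sed -e 's/ -Ofast//g' \
      -e 's/ -Wno-unused-result//g' \
      -e 's/ -march=native//g' \
      "$1" > "$2"
}

# Usage (assumed file names):
#   strip_flags makefile makefile.mac && make -f makefile.mac
```

Writing to a separate file keeps the original makefile around for comparison, and avoids the differences between GNU and BSD "sed -i".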

GoogleCodeExporter commented 8 years ago
Thanks eaton...@gmail.com - that appears to have worked (binaries run, at least 
when not provided with arguments).

Original comment by peter.mo...@alfresco.com on 15 Aug 2013 at 8:15

GoogleCodeExporter commented 8 years ago
This is my modified build for Mac. It worked with text8.zip. I suggest 
downloading and extracting it manually, since the script uses wget for the 
download and wget cannot be found on a Mac.

Original comment by akshayub...@gmail.com on 17 Aug 2013 at 5:57

GoogleCodeExporter commented 8 years ago
./distance in this mac package works with the bin generated from text8, but not 
with the freebase bin file. Just me or everyone?

Original comment by libins...@gmail.com on 17 Aug 2013 at 1:38

GoogleCodeExporter commented 8 years ago
A slightly better way to go about it: if you replace gcc with clang, which is 
what OS X is sticking to now, then you just replace -Ofast with -O2 and 
-Wno-unused-result with -Wunused-result.

Original comment by dluna...@gmail.com on 18 Aug 2013 at 11:20

GoogleCodeExporter commented 8 years ago
I had to do the following to get the demos to work on my 10.8.2 Hackintosh:

* in the makefile:
    * replace 'gcc' with 'clang'
    * replace '-Ofast' with '-O2'
    * replace '-Wno-unused-result' with '-Wunused-result'

* where needed in the *.c files, replace '#include <malloc.h>' with '#include 
<stdlib.h>'

* install 'wget' (I used the instructions at 
http://osxdaily.com/2012/05/22/install-wget-mac-os-x/)

If the files text8 and text8-phrase do not appear after running one of the 
scripts, you can download them from http://mattmahoney.net/dc/text8.zip.

This looks like really cool technology!

Original comment by GreggInCA@gmail.com on 22 Aug 2013 at 2:55
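
The include replacement described above can also be scripted. This is a rough sketch: fix_malloc_include is a hypothetical helper name, and the .orig backup suffix is my own choice. It only rewrites files that contain the malloc.h include. If a file already includes stdlib.h you end up with a duplicate include, which is harmless.

```shell
#!/bin/sh
# Sketch only: fix_malloc_include is a hypothetical helper.
# It swaps '#include <malloc.h>' for '#include <stdlib.h>' in every .c
# file of a directory and keeps the untouched file as NAME.c.orig.
fix_malloc_include() {
  # $1 = directory holding the .c files (defaults to the current one)
  dir=${1:-.}
  for f in "$dir"/*.c; do
    [ -f "$f" ] || continue                          # glob matched nothing
    grep -q '#include <malloc.h>' "$f" || continue   # nothing to change
    cp "$f" "$f.orig"
    sed 's/#include <malloc\.h>/#include <stdlib.h>/' "$f.orig" > "$f"
  done
}

# Usage, from the checkout directory:
#   fix_malloc_include .
```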

GoogleCodeExporter commented 8 years ago
Instead of getting or building wget, why not use curl?
In demo-word.sh, for example, replace the wget call with:
curl -o text8.gz http://mattmahoney.net/dc/text8.zip

Original comment by e...@vanstegeren.com on 27 Aug 2013 at 1:08
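
If you want the demo scripts to run unchanged on machines with either tool, something like the following could work. It is a sketch under assumptions: pick_downloader and fetch are hypothetical helper names, and the -L flag for curl (follow redirects) is my addition, not something from the comment above.

```shell
#!/bin/sh
# Sketch only: pick_downloader and fetch are hypothetical helpers.
# pick_downloader reports which download tool is on PATH.
pick_downloader() {
  if command -v wget >/dev/null 2>&1; then
    echo wget
  elif command -v curl >/dev/null 2>&1; then
    echo curl
  else
    echo none
  fi
}

# fetch downloads a URL to a file with whichever tool was found.
fetch() {
  # $1 = URL, $2 = output file
  case "$(pick_downloader)" in
    wget) wget "$1" -O "$2" ;;
    curl) curl -L -o "$2" "$1" ;;
    *)    echo "need wget or curl on PATH" >&2; return 1 ;;
  esac
}

# Usage, mirroring the download step of the demo scripts:
#   fetch http://mattmahoney.net/dc/text8.zip text8.gz
```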

GoogleCodeExporter commented 8 years ago
CFLAGS = -lm -lc -pthread -O2 -msse4.2 -Wall -funroll-loops -Wunused-result

and replaced or removed all (where already present):
#include <malloc.h>
with:
#include <stdlib.h> 

Original comment by florian.leitner on 18 Nov 2013 at 2:27

GoogleCodeExporter commented 8 years ago
After compiling on Mavericks (a simple substitution of stdlib.h for malloc.h, 
with nothing changed in the compiler parameters), word2vec works well with 
demo-word.sh and demo-phrases.sh, but not with demo-word-accuracy. I get a 
segfault at line 7. Running only line 7 (as I already have the vectors.bin 
used in demo-word.sh), I get:
./compute-accuracy vectors.bin 30000 < questions-words.txt
capital-common-countries:
Segmentation fault: 11
Any idea?

Original comment by piero.mo...@gmail.com on 20 Nov 2013 at 5:03

GoogleCodeExporter commented 8 years ago
It could be related to the non-portable call to gzip when unpacking the test 
data. In fact, if you look at the demo scripts and change the line with gzip to 
the line with unzip, the demo should run.

  #gzip -d text8.gz -f
  unzip -p text8.gz > text8

Regarding the malloc/stdlib error, you can add a block of preprocessor 
directives that checks whether __APPLE__ is defined. Something like the 
following should work with distance.c, word-analogy.c, and compute-accuracy.c:

#ifdef __APPLE__
#include <sys/malloc.h>
#include <stdlib.h>
#else
#include <malloc.h>
#endif

Best of luck!
Paul

Original comment by paulrigo...@gmail.com on 13 Dec 2013 at 10:51
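
On the gzip versus unzip point: the downloaded file is a zip archive saved under a .gz name, so one option is to choose the unpacker from the file's first two bytes instead of its name. The sketch below assumes exactly that. archive_kind and unpack_text8 are hypothetical helper names, and the magic numbers are the standard ones for zip ("PK", hex 50 4b) and gzip (hex 1f 8b).

```shell
#!/bin/sh
# Sketch only: archive_kind and unpack_text8 are hypothetical helpers.
# archive_kind classifies a file by its first two bytes.
archive_kind() {
  # od prints the bytes as hex; tr strips the padding whitespace
  magic=$(od -An -tx1 -N2 "$1" | tr -d '[:space:]')
  case "$magic" in
    504b) echo zip ;;      # "PK"
    1f8b) echo gzip ;;
    *)    echo unknown ;;
  esac
}

# unpack_text8 writes the decompressed data to the given output file.
unpack_text8() {
  # $1 = downloaded archive, $2 = output file
  case "$(archive_kind "$1")" in
    zip)  unzip -p "$1" > "$2" ;;
    gzip) gzip -dc "$1" > "$2" ;;
    *)    echo "unrecognized archive: $1" >&2; return 1 ;;
  esac
}

# Usage in place of the gzip line in a demo script:
#   unpack_text8 text8.gz text8
```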

GoogleCodeExporter commented 8 years ago
I had issues on OS X 10.9.2, and my fix was to install gcc 4.7 using macports

1. "sudo port install gcc47" This will install gcc as gcc-mp-47 so you will 
need to change the first line in the makefile to refer to that instead of just 
"gcc".
2. Some libraries are needed from /usr/include/sys so you have to add to the 
CFLAGS statement in the makefile "-I/usr/include/sys"
3. Unfortunately, the header file time.h in /usr/include/sys is not the one you 
want because it doesn't define clock_t type. So we have to explicitly refer to 
"#include </usr/include/time.h>" in the word2vec.c file and any others that 
declare variables of clock_t type.

Hopefully this will save others some time.
Eddie

Original comment by edeussil...@gmail.com on 28 Mar 2014 at 10:38