sublime09 / CrySmile

Emoji analysis on Twitter
MIT License

Check if Tweet2Vec is better than a TF-IDF matrix for embeddings. #10

Open sublime09 opened 6 years ago

sublime09 commented 6 years ago

Problem with running tweet2vec on my home machine:

PS C:\Users\Patrick\Desktop\tweet2vec-master\tweet2vec> python encode_char.py ../misc/efcrysmilet.txt best_model/ result/
<<< Import error from Theano's lazy linker >>> 
    <<< This is handled BUT then another error: >>>
<<< Same Import error from Theano's lazy linker >>> 
    <<< This is handled again BUT then yet another error: >>>
File "C:\Users\Patrick\AppData\Local\Programs\Python\Python36\lib\site-packages\theano\gof\cmodule.py", line 2359, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
 failed (return status=1): C:\Users\Patrick\AppData\Local\Theano\compiledir_Windows-10-10.0.16299-SP0-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-3.6.3-64\lazylinker_ext\mod.cpp:1:0: sorry, unimplemented: 64-bit mode not compiled in.
 #include <Python.h>
^

<<< The C code in the temp file while running this is:>>>
Problem occurred during compilation with the command line below:
"C:\MinGW\bin\g++.exe" -shared -g -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 
"C:\Users\Patrick\AppData\Local\Theano\compiledir_Windows-10-10.0.16299-SP0-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-3.6.3-64\lazylinker_ext\mod.cpp" -lpython36
C:\Users\Patrick\AppData\Local\Theano\compiledir_Windows-10-10.0.16299-SP0-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-3.6.3-64\lazylinker_ext\mod.cpp:1:0: sorry, unimplemented: 64-bit mode not compiled in

 #include <Python.h>

 ^

I can't really see why. Here are my checks:

PS C:\Users\Patrick\Desktop\tweet2vec-master\tweet2vec> gcc --version
gcc.exe (GCC) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.
PS C:\Users\Patrick\Desktop\tweet2vec-master\tweet2vec> g++.exe --version
g++.exe (GCC) 4.8.1
Copyright (C) 2013 Free Software Foundation, Inc.

PS C:\Users\Patrick\Desktop\tweet2vec-master\tweet2vec> pip list
Package           Version
----------------- ----------
gensim            3.4.0
Lasagne           0.1
matplotlib        2.1.0
nltk              3.2.5
numpy             1.13.3+mkl
pandas            0.22.0
pip               10.0.1
scikit-learn      0.19.1
scipy             1.0.0
sklearn           0.0
Theano            1.0.1
twython           3.6.0
 <<< others removed for brevity >>>
PS C:\Users\Patrick\Desktop\tweet2vec-master\tweet2vec> python
Python 3.6.3 (v3.6.3:2c5fed8, Oct  3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)] on win32
 <<< So AMD64 means I have 64-bit Python, which is exactly what I expected and wanted. >>>

@jmpu Did you have this issue at all? I've looked it up, and the solutions I found just focus on making the C compiler (gcc) available on the PATH environment variable. Mine already is: that is how I was able to run the gcc commands above.
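One thing the checks above do not cover is whether the compiler itself can target 64-bit. The message "64-bit mode not compiled in" is what a GCC build without 64-bit support prints when it is passed `-m64`, and the original MinGW distribution (commonly installed to `C:\MinGW`) is 32-bit only, so that is a likely cause here, though not confirmed. Below is a minimal sketch for comparing the two sides, assuming `g++` is on PATH. The helper names `python_bits`, `compiler_target`, and `looks_64_bit` are made up for illustration and are not part of Theano or tweet2vec.

```python
import shutil
import struct
import subprocess


def python_bits():
    """Pointer size of the running interpreter, in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def compiler_target(compiler="g++"):
    """Return the target triple printed by `<compiler> -dumpmachine`.

    Returns None if the compiler is not on PATH or the call fails.
    """
    path = shutil.which(compiler)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-dumpmachine"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def looks_64_bit(target):
    """Heuristic for x86 only: does the triple name a 64-bit target?"""
    return target is not None and target.startswith(("x86_64", "amd64"))


if __name__ == "__main__":
    bits = python_bits()
    target = compiler_target()
    print("Python:", bits, "bit")
    print("g++ target:", target)
    if bits == 64 and target is not None and not looks_64_bit(target):
        print("Mismatch: 64-bit Python but g++ does not target x86_64.")
```

A 64-bit MinGW-w64 toolchain typically reports a triple starting with `x86_64` (for example `x86_64-w64-mingw32`), while the original MinGW reports `mingw32`. If the script prints the mismatch line, switching to a 64-bit toolchain would be the thing to try.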

sublime09 commented 6 years ago

Jiameng's comment on duplicate issue / note:

@sublime09 I feel it's more complicated to project tweets into a vector space, but check out the repos below (or search "tweet2vec" across the whole community):

https://github.com/bdhingra/tweet2vec
https://github.com/soroushv/Tweet2Vec

And I have no idea how to do this using NLTK and scikit-learn.
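For the TF-IDF side of the comparison in the issue title, here is a dependency-free sketch of what a TF-IDF matrix over tweets looks like. It assumes plain whitespace tokenization and the unsmoothed weight tf × log(N / df). The function names `tfidf_matrix` and `cosine` are illustrative only. In practice scikit-learn's `TfidfVectorizer` is the usual tool, and its defaults (smoothed idf, L2-normalized rows) give different numbers than this sketch.

```python
import math
from collections import Counter


def tfidf_matrix(tweets):
    """Return (vocabulary, rows); rows[i][j] is the weight of vocabulary[j] in tweets[i]."""
    # Assumption: lowercase + whitespace split stands in for a real tweet tokenizer.
    docs = [tweet.lower().split() for tweet in tweets]
    vocabulary = sorted({token for doc in docs for token in doc})
    n_docs = len(docs)

    # Document frequency: number of tweets that contain each token at least once.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    idf = {token: math.log(n_docs / doc_freq[token]) for token in vocabulary}

    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero on an empty tweet
        rows.append([counts[token] / total * idf[token] for token in vocabulary])
    return vocabulary, rows


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    vocab, rows = tfidf_matrix(["so happy today", "so sad today", "happy happy joy"])
    print(vocab)
    print(cosine(rows[0], rows[2]))
```

Each row is one tweet's vector, so the same downstream classifier could be fed either these rows or Tweet2Vec encodings to compare the two embeddings.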