fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

Install and Test-run SEERa #47

Closed Lillliant closed 1 year ago

Lillliant commented 1 year ago

This is an issue page to log my progress setting up and test-running seera.

Lillliant commented 1 year ago

@hosseinfani @soroush-ziaeinejad

So far, I have:

The first time I tried to install DynamicGEM, I received an error saying that Microsoft Visual C++ 14.0 or greater is required. I tried again today and the installation succeeded, though I don't yet know why. After installing mallet, I will come back and investigate why it worked this time.

hosseinfani commented 1 year ago

@Lillliant Thank you for the update. Let us know if we can help.

Lillliant commented 1 year ago

Update:

I've installed all the packages and made a few test runs. It looks like DynamicGEM installs successfully whenever the seera environment is activated. Here are some errors that occurred during my test runs:

Traceback (most recent call last):
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 174, in run
    main()
  File "main.py", line 132, in main
    from gel import GraphEmbedding as GE
  File ".\gel\GraphEmbedding.py", line 5, in <module>
    from dynamicgem.embedding.dynAERNN import DynAERNN
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\dynamicgem\embedding\dynAERNN.py", line 17, in <module>
    from keras.layers import Input, Dense, Lambda, merge, Subtract
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\keras\__init__.py", line 3, in <module>
    from . import utils
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\keras\utils\__init__.py", line 6, in <module>
    from . import conv_utils
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\keras\utils\conv_utils.py", line 9, in <module>
    from .. import backend as K
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\keras\backend\__init__.py", line 89, in <module>
    from .tensorflow_backend import *
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\keras\backend\tensorflow_backend.py", line 5, in <module>
    import tensorflow as tf
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\__init__.py", line 22, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "main.py", line 129, in main
    embeddings = pd.read_pickle(f'{Params.gel["path2save"]}/Embeddings.pkl')
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\pandas\io\pickle.py", line 169, in read_pickle
    f, fh = get_handle(fp_or_buf, "rb", compression=compression, is_text=False)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\pandas\io\common.py", line 499, in get_handle
    f = open(path_or_buf, mode)
FileNotFoundError: [Errno 2] No such file or directory: '../output/test-run-1/lda.mallet.dynae/gel/Embeddings.pkl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: DLL load failed: The specified module could not be found.

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.

Based on what I read from TensorFlow's documentation, I think this error is caused by not having Microsoft Visual C++ 14.0 or greater for TensorFlow to build during the runs. I've downloaded the build tools accordingly, and I'll see if it works in the next trial run.
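A quick way to check whether the fix took effect, before starting a full run, is to try the imports directly in the activated environment. This is only a minimal sketch: `check_import` is a hypothetical helper, not part of SEERa, and the module names you pass to it would be the ones from the traceback above.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # A missing native library on Windows surfaces here as
        # "DLL load failed"; ModuleNotFoundError is a subclass of ImportError.
        return False, f"{module_name}: {exc}"
    return True, f"{module_name}: OK"
```

For example, calling `check_import("tensorflow")`, `check_import("keras")`, and `check_import("dynamicgem")` in turn should show which layer of the import chain fails first.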

I also ran into this problem in the tml layer:

Traceback (most recent call last):
  File "main.py", line 103, in main
    graphs = pd.read_pickle(path)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\pandas\io\pickle.py", line 169, in read_pickle
    f, fh = get_handle(fp_or_buf, "rb", compression=compression, is_text=False)
  File "C:\Users\cw_\anaconda3\envs\seera\lib\site-packages\pandas\io\common.py", line 499, in get_handle
    f = open(path_or_buf, mode)
FileNotFoundError: [Errno 2] No such file or directory: '../output/test-run-1/gsdmm.dynaernn/uml/graphs/graphs.pkl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 174, in run
    main()
  File "main.py", line 110, in main
    just_one=Params.tml['justOne'], binary=Params.tml['binary'], threshold=Params.tml['threshold'])
  File "C:\Users\cw_\Documents\GitHub\seera\src\uml\UserSimilarities.py", line 33, in main
    d2t = tm.doc2topics(lda_model, dictionary.doc2bow(doc.split()), threshold=threshold, just_one=just_one, binary=binary)
  File "C:\Users\cw_\Documents\GitHub\seera\src\tml\TopicModeling.py", line 163, in doc2topics
    if t_temp >= threshold:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

It might be a byproduct of the previous errors, but it looks like t_temp is an array with 2 elements inside.
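The `ValueError` itself can be reproduced in isolation, which may help narrow things down. The sketch below is an illustration only: the values are made up, and `passes_threshold` is a hypothetical helper, not the actual logic in `TopicModeling.doc2topics`. It shows why NumPy refuses to use a multi-element comparison in an `if`, and one possible way to reduce it to a single boolean.

```python
import numpy as np


def passes_threshold(value, threshold):
    """Return True if any element of value reaches the threshold.

    Accepts a plain number or an array-like. Using any() is an assumption
    about the intended behaviour; all() would be the stricter choice.
    """
    arr = np.atleast_1d(np.asarray(value, dtype=float))
    return bool((arr >= threshold).any())


# Reproduce the error: a two-element array has no single truth value.
t_temp = np.array([0.7, 0.1])  # made-up values for illustration
try:
    if t_temp >= 0.5:
        pass
except ValueError as exc:
    error_message = str(exc)
```

Whether `any()` or `all()` (or picking one element) is correct depends on what the two elements of `t_temp` represent, so it is worth inspecting the value in a debugger first.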

hosseinfani commented 1 year ago

@Lillliant Please pull, as there is a new version of the code. Also, have a look at #45 and #42.

If that doesn't resolve it, drop by the lab and @soroush-ziaeinejad or I can help further.

Lillliant commented 1 year ago

@hosseinfani @soroush-ziaeinejad

Update

After pulling the new changes, installing MS Visual C++, and reinstalling TensorFlow and h5py, I can now run SEERa on my laptop with the toy.synthetic data.

I've made a few test runs using the different topic modelling and graph embedding methods, and some of the combinations don't run successfully:

The issue with gsdmm persists after re-cloning and re-setting up gsdmm.

If possible, can I come by the lab on Thursday/Friday to resolve any issues and/or discuss any next steps?

hosseinfani commented 1 year ago

@Lillliant Awesome.

You're right, there are some combinations that are not working properly. @soroush-ziaeinejad is working on them. You can also debug the code yourself to see what the problem is and where it occurs (all the source is available), and then try to fix it if you can.

Sure, I'll be at EH215 most weekdays (except Mon-Wed, when I have classes). Also, @soroush-ziaeinejad will be here to help you more.
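When debugging tracebacks like the ones earlier in this thread, it may help to remember that "During handling of the above exception, another exception occurred" means the second error was raised inside an `except` block. The sketch below assumes the `FileNotFoundError` is simply a cache miss on a pickle file and that the later error is the one to fix. The names `load_or_compute`, `exception_chain`, and `failing_step` are hypothetical and are not taken from SEERa's source.

```python
import pickle


def load_or_compute(path, compute):
    """Return the cached pickle at path, or call compute() on a cache miss."""
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        # Cache miss: any error raised by compute() is chained to the
        # FileNotFoundError via __context__, giving a two-part traceback.
        return compute()


def exception_chain(exc):
    """List exception type names from the outermost error back to the first."""
    names = []
    while exc is not None:
        names.append(type(exc).__name__)
        exc = exc.__cause__ or exc.__context__
    return names


def failing_step():
    # Hypothetical stand-in for a pipeline layer that fails.
    raise ValueError("simulated failure in the compute step")


try:
    load_or_compute("no-such-dir/missing.pkl", failing_step)
except ValueError as err:
    chain = exception_chain(err)
```

Here `chain` lists the `ValueError` first and the `FileNotFoundError` second, mirroring the order in which the root problem and the harmless cache miss should be investigated.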