fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

Installation of SEERa #42

Closed hosseinfani closed 2 years ago

hosseinfani commented 2 years ago

@Sharjeeliv This is an issue page to log your progress in setting up SEERa. Please let us know of any concerns or questions in this regard.

hosseinfani commented 2 years ago

Hey @Sharjeeliv Hope you're doing well. Any update?

Sharjeeliv commented 2 years ago

Hi Professor, apologies for the delay; things have been hectic with the co-op term coming to an end. Initially, I had issues with TensorFlow 1.11.0 since I have Python 3.9 installed globally. To work around this, I used pyenv to create a local 3.6.15 installation. However, now I am getting the following error:

(venv) ❯ pip3.6 install -r requirements.txt
Collecting Keras==2.2.4 (from -r requirements.txt (line 3))
  Using cached https://files.pythonhosted.org/packages/5e/10/aa32dad071ce52b5502266b5c659451cfd6ffcbf14e6c8c4f16c0ff5aaab/Keras-2.2.4-py2.py3-none-any.whl
Collecting matplotlib==3.0.1 (from -r requirements.txt (line 4))
  Using cached https://files.pythonhosted.org/packages/62/81/e394906a8a15c46b56110c558c222d4d9b3735f0595e254918eca47f98cf/matplotlib-3.0.1.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 154, in save_modules
        yield saved
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 195, in setup_context
        yield
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 250, in run_setup
        _execfile(setup_script, ns)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 45, in _execfile
        exec(code, globals, locals)
      File "/var/folders/5d/7vq9v_3x3y5849555c0rfl6c0000gn/T/easy_install-4g57dokx/numpy-1.23.2/setup.py", line 39, in <module>
        from setupext import print_line, print_raw, print_message, print_status
    RuntimeError: Python version >= 3.8 required.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/5d/7vq9v_3x3y5849555c0rfl6c0000gn/T/pip-install-mrt0hzw8/matplotlib/setup.py", line 242, in <module>
        cmdclass=cmdclass,
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/__init__.py", line 142, in setup
        _install_setup_requires(attrs)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/__init__.py", line 137, in _install_setup_requires
        dist.fetch_build_eggs(dist.setup_requires)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/dist.py", line 586, in fetch_build_eggs
        replace_conflicting=True,
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/pkg_resources/__init__.py", line 780, in resolve
        replace_conflicting=replace_conflicting
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1063, in best_match
        return self.obtain(req, installer)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1075, in obtain
        return installer(requirement)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/dist.py", line 653, in fetch_build_egg
        return cmd.easy_install(req)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 679, in easy_install
        return self.install_item(spec, dist.location, tmpdir, deps)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 705, in install_item
        dists = self.install_eggs(spec, download, tmpdir)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 890, in install_eggs
        return self.build_and_install(setup_script, setup_base)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 1158, in build_and_install
        self.run_setup(setup_script, setup_base, args)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 1144, in run_setup
        run_setup(setup_script, args)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 253, in run_setup
        raise
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/contextlib.py", line 99, in __exit__
        self.gen.throw(type, value, traceback)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 195, in setup_context
        yield
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/contextlib.py", line 99, in __exit__
        self.gen.throw(type, value, traceback)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 166, in save_modules
        saved_exc.resume()
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 141, in resume
        six.reraise(type, exc, self._tb)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/_vendor/six.py", line 685, in reraise
        raise value.with_traceback(tb)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 154, in save_modules
        yield saved
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 195, in setup_context
        yield
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 250, in run_setup
        _execfile(setup_script, ns)
      File "/Users/sharjeelmustafa/.pyenv/versions/3.6.15/lib/python3.6/site-packages/setuptools/sandbox.py", line 45, in _execfile
        exec(code, globals, locals)
      File "/var/folders/5d/7vq9v_3x3y5849555c0rfl6c0000gn/T/easy_install-4g57dokx/numpy-1.23.2/setup.py", line 39, in <module>
        from setupext import print_line, print_raw, print_message, print_status
    RuntimeError: Python version >= 3.8 required.
    ============================================================================
    Edit setup.cfg to change the build options

    BUILDING MATPLOTLIB
                matplotlib: yes [3.0.1]
                    python: yes [3.6.15 (default, Aug 23 2022, 08:44:33)  [GCC
                            Apple LLVM 13.1.6 (clang-1316.0.21.2.5)]]
                  platform: yes [darwin]

    REQUIRED DEPENDENCIES AND EXTENSIONS
                     numpy: yes [not found. pip may install it below.]
          install_requires: yes [handled by setuptools]
                    libagg: yes [pkg-config information for 'libagg' could not
                            be found. Using local copy.]
                  freetype: yes [version 2.12.1]
                       png: yes [version 1.6.37]
                     qhull: yes [pkg-config information for 'libqhull' could not
                            be found. Using local copy.]

    OPTIONAL SUBPACKAGES
               sample_data: yes [installing]
                  toolkits: yes [installing]
                     tests: no  [skipping due to configuration]
            toolkits_tests: no  [skipping due to configuration]

    OPTIONAL BACKEND EXTENSIONS
                       agg: yes [installing]
                     tkagg: yes [installing; run-time loading from Python Tcl /
                            Tk]
                    macosx: yes [installing, darwin]
                 windowing: no  [Microsoft Windows only]

    OPTIONAL PACKAGE DATA
                      dlls: no  [skipping due to configuration]

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/5d/7vq9v_3x3y5849555c0rfl6c0000gn/T/pip-install-mrt0hzw8/matplotlib/

hosseinfani commented 2 years ago

@soroush-ziaeinejad Any idea? The issue is with matplotlib. I think the 3.0 version is the one that should be installed on Python 3.6.

@Sharjeeliv Have you tried conda and environment.yml?

soroush-ziaeinejad commented 2 years ago

@hosseinfani @Sharjeeliv

matplotlib==3.0.1 on Windows and matplotlib==3.3.4 on Linux have been tested, and both are compatible with tensorflow==1.11.0 and Python 3.6.

The problem may be solved by using conda (please try it and let us know), but we have to fix the pip issue anyway.
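
A note on the pip failure: the traceback above shows matplotlib's build step fetching numpy-1.23.2, which refuses to run on Python 3.6 (`RuntimeError: Python version >= 3.8 required`). One possible workaround, sketched below, is to install a numpy release that still supports Python 3.6 before installing the rest of the requirements. The `numpy==1.19.5` pin and the `pip_commands` helper are assumptions for illustration and were not tested in this thread.

```python
import shlex
import sys

# Assumption: numpy 1.19.5 is a release that still supports Python 3.6.
NUMPY_FOR_PY36 = "numpy==1.19.5"


def pip_commands(requirements="requirements.txt", python=sys.executable):
    """Return pip invocations that install a pinned numpy before the rest.

    Installing numpy first means matplotlib's build step can find an
    existing numpy instead of downloading the newest release.
    """
    return [
        [python, "-m", "pip", "install", NUMPY_FOR_PY36],
        [python, "-m", "pip", "install", "-r", requirements],
    ]


# Print the commands rather than running them, so they can be reviewed first.
for cmd in pip_commands():
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the two printed commands in order inside the Python 3.6 virtual environment may avoid the numpy build error; whether it resolves every pinned requirement is not confirmed here.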

Sharjeeliv commented 2 years ago

Ok, I will use conda and report back the results.

Sharjeeliv commented 2 years ago

When installing with conda, I got the error below. Since you mentioned in our meeting that we are not using tensorflow-gpu, and the pip requirements only include tensorflow, I removed tensorflow-gpu from the yml file, and the installation then succeeded.

Was this correct? Or is there another way to fix the below error?

(anaconda3) ❯ conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.12.0
  latest version: 4.14.0

Please update conda by running

    $ conda update -n base -c defaults conda

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: \ Ran pip subprocess with arguments:
['/Users/sharjeelmustafa/opt/anaconda3/envs/seera/bin/python', '-m', 'pip', 'install', '-U', '-r', '/Users/sharjeelmustafa/Documents/02 Work/01 Research/Y3-2022-F/SEERa/seera/condaenv.8k4xm5bz.requirements.txt']
Pip subprocess output:
Collecting h5py==2.8.0
  Using cached h5py-2.8.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (6.0 MB)
Collecting Cython>=0.29
  Using cached Cython-0.29.32-py2.py3-none-any.whl (986 kB)
Collecting Keras==2.2.4
  Using cached Keras-2.2.4-py2.py3-none-any.whl (312 kB)
Collecting matplotlib==3.0.1
  Using cached matplotlib-3.0.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.1 MB)

Pip subprocess error:
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==1.11.0 (from versions: 0.12.1, 1.0.0, 1.0.1, 1.1.0)
ERROR: No matching distribution found for tensorflow-gpu==1.11.0

failed

CondaEnvException: Pip failed

hosseinfani commented 2 years ago

@Sharjeeliv Yep, that's correct. SEERa is not GPU-ready yet.

OK, now you can trace the code as it runs on the toy data and work out what each layer does.
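
The fix confirmed above (removing tensorflow-gpu from the yml file) can also be scripted. The following is a minimal sketch, assuming the dependency appears as a plain `- tensorflow-gpu==...` list entry. The `drop_dependency` helper and the sample file contents are hypothetical; the real environment.yml may be laid out differently.

```python
import re


def drop_dependency(yml_text, package):
    """Remove list entries for `package` (with or without a version pin)."""
    pattern = re.compile(
        r"^\s*-\s*" + re.escape(package) + r"\s*([=<>!~].*)?$"
    )
    kept = [line for line in yml_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Hypothetical file contents, for illustration only.
sample = """\
name: seera
dependencies:
  - python=3.6
  - pip
  - pip:
    - Keras==2.2.4
    - matplotlib==3.0.1
    - tensorflow==1.11.0
    - tensorflow-gpu==1.11.0
"""

print(drop_dependency(sample, "tensorflow-gpu"))
```

Matching the full package name keeps the `tensorflow==1.11.0` entry in place while dropping only the GPU variant.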

hosseinfani commented 2 years ago

Hi @Sharjeeliv Any update on your progress?

Sharjeeliv commented 2 years ago

Hello Sir,

I've been reading the code to try to understand what's going on. However, when I try to run it I get the following error. I thought the malletHome param might cause the issue since it's a hardcoded path, but changing it doesn't seem to fix the issue (and it also changes back into the original path??).

I'm in the process of reading the documentation for the packages.

Running pipeline for lda and dynaernn ....

1. DAL: Temporal Document Creation from Social Posts ...
##################################################
1.1. Loading saved temporal documents from  ../output/toy/lda.dynaernn/Documents.csv in which 
(User, Time) a document is concat of user's posts in each 1 day(s)...
(#ProcessedDocuments, #Documents, #Users, #TimeIntervals): (180,180,60,3)
Time Elapsed: 0.008581161499023438

2. TML: Topic Modeling ...
##################################################
2.1. Loading saved topic model of lda from ../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm and ../output/toy/lda.dynaernn/tml/3Topics.model ...
loading Dictionary object from ../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm
{'transport_params': None, 'compression': 'infer_from_extension', 'opener': None, 'closefd': True, 'newline': None, 'errors': None, 'encoding': None, 'buffering': -1, 'mode': 'rb', 'uri': '../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm'}
loaded ../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm
loading LdaModel object from ../output/toy/lda.dynaernn/tml/3Topics.model
{'transport_params': None, 'compression': 'infer_from_extension', 'opener': None, 'closefd': True, 'newline': None, 'errors': None, 'encoding': None, 'buffering': -1, 'mode': 'rb', 'uri': '../output/toy/lda.dynaernn/tml/3Topics.model'}
2.1. Loading saved topic model failed! Training a model ...
(#Topics, Model): (3, lda)
adding document #0 to Dictionary(0 unique tokens: [])
built Dictionary(29 unique tokens: ['apple', 'dell', 'digital', 'keyboard', 'microsoft']...) from 180 documents (total 5832 corpus positions)
saving Dictionary object under ../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm, separately None
{'transport_params': None, 'compression': 'infer_from_extension', 'opener': None, 'closefd': True, 'newline': None, 'errors': None, 'encoding': None, 'buffering': -1, 'mode': 'wb', 'uri': '../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm'}
saved ../output/toy/lda.dynaernn/tml/3TopicsDictionary.mm
discarding 0 tokens: []...
keeping 29 tokens which were in no less than 2 and no more than 108 (=60.0%) documents
rebuilding dictionary, shrinking gaps
resulting dictionary: Dictionary(29 unique tokens: ['apple', 'dell', 'digital', 'keyboard', 'microsoft']...)
Traceback (most recent call last):
  File "main.py", line 63, in main
    lda_model = gensim.models.LdaModel.load(path_mdl)
  File "/Users/sharjeelmustafa/opt/anaconda3/envs/seera/lib/python3.6/site-packages/gensim/models/ldamodel.py", line 1638, in load
    result = super(LdaModel, cls).load(fname, *args, **kwargs)
  File "/Users/sharjeelmustafa/opt/anaconda3/envs/seera/lib/python3.6/site-packages/gensim/utils.py", line 435, in load
    obj = unpickle(fname)
  File "/Users/sharjeelmustafa/opt/anaconda3/envs/seera/lib/python3.6/site-packages/gensim/utils.py", line 1395, in unpickle
    with open(fname, 'rb') as f:
  File "/Users/sharjeelmustafa/opt/anaconda3/envs/seera/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 184, in open
    newline=newline,
  File "/Users/sharjeelmustafa/opt/anaconda3/envs/seera/lib/python3.6/site-packages/smart_open/smart_open_lib.py", line 363, in _shortcut_open
    return _builtin_open(local_path, mode, buffering=buffering, **open_kwargs)
FileNotFoundError: [Errno 2] No such file or directory: '../output/toy/lda.dynaernn/tml/3Topics.model'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 151, in run
    main()
  File "main.py", line 73, in main
    path_2_save_tml=Params.tml['path2save'])
  File "/Users/sharjeelmustafa/Documents/02 Work/01 Research/Y3-2022-F/SEERa/seera/src/tml/TopicModeling.py", line 98, in topic_modeling
    raise ValueError("Invalid topic modeling!")
ValueError: Invalid topic modeling!

hosseinfani commented 2 years ago

Hi @Sharjeeliv Sorry for my late reply. Please expedite this step as we need you to embark on coding very soon.

Sharjeeliv commented 2 years ago

OK, I will try to finish this weekend. If possible, I would like to see how you or Soroush have it set up and working (e.g., in PyCharm).

hosseinfani commented 2 years ago

@soroush-ziaeinejad would you please set up a quick meeting with @Sharjeeliv for the installation? Thank you.

Sharjeeliv commented 2 years ago

Is it okay if I drop by your office for things related to SEERa, or should I keep everything here?

hosseinfani commented 2 years ago

@Sharjeeliv For sure. I'm at our lab now (215 Essex Hall).

Sharjeeliv commented 2 years ago

Hi, Professor @hosseinfani

Could you please post the SEERa project on the Outstanding Scholars Sharepoint? I need to complete the form before Wed Sep 28.

Progress:

hosseinfani commented 2 years ago

@Sharjeeliv let me know the link for the sharepoint please.

Sharjeeliv commented 2 years ago

https://uwin365.sharepoint.com/sites/OSRECORDS/Lists/Project%20Proposal%20Admin/AllItems.aspx

The outstanding student council said to contact Dr. Tim Brunet to make a posting if you can't access it already.

hosseinfani commented 2 years ago

@Sharjeeliv done!

Sharjeeliv commented 2 years ago

Update:

farinamhz commented 2 years ago

I ran into the same issue, with this error: "ValueError: Invalid topic modeling!"

Solution: The problem was with the run command in the readme.

Originally, the run command was:

python -u main.py -r toy -t LDA -g AE DynAE DynAERNN

However, the valid options for the tml method are 'lda.gensim', 'lda.mallet', and 'gsdmm', so `-t LDA` does not match any of them.

So, the run command should be something like this:

python -u main.py -r toy -t lda.gensim gsdmm -g AE DynAE DynAERNN

I fixed it in the readme file.
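
One way to surface this kind of mistake earlier is to have the command-line parser reject unknown method names before the pipeline starts. The sketch below uses `argparse` with `choices`. Only the three method names come from the comment above; the parser layout, `dest` names, and `build_parser` helper are assumptions and may not match how main.py actually parses its arguments.

```python
import argparse

# Method names quoted in this thread; everything else here is assumed.
TML_METHODS = ["lda.gensim", "lda.mallet", "gsdmm"]


def build_parser():
    """Build a hypothetical SEERa-style parser that validates -t values."""
    parser = argparse.ArgumentParser(description="Hypothetical SEERa-style CLI")
    parser.add_argument("-r", dest="run", default="toy")
    # Each value given to -t must be one of TML_METHODS.
    parser.add_argument("-t", dest="tml", nargs="+", choices=TML_METHODS,
                        required=True)
    parser.add_argument("-g", dest="gel", nargs="+", default=[])
    return parser


argv = "-r toy -t lda.gensim gsdmm -g AE DynAE DynAERNN".split()
args = build_parser().parse_args(argv)
print(args.tml, args.gel)
```

With this layout, passing `-t LDA` makes `argparse` exit with a usage message listing the valid choices, instead of failing later with "Invalid topic modeling!".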

hosseinfani commented 2 years ago

@Sharjeeliv I believe you were able to run SEERa successfully, so I'm closing this issue.