alicia-ziying-yang / conTEXT-explorer

ConTEXT Explorer is an open web-based system for exploring and visualizing concepts (combinations of occurring words and phrases) over time in text documents.
Apache License 2.0

Can't upload new data set #14

Closed baileythegreen closed 2 years ago

baileythegreen commented 2 years ago

JOSS Reference: openjournals/joss-reviews#3347

Following the changes that have been made, I have been trying to verify the new functionality. However, I am now unable to upload a new data set, and I am unsure why. I even created an entirely new virtualenv to try from a clean environment, and I am still unable to do so. The dashboard starts, both locally and on the server, and I can open the analysis portion for the sample data. A data set I had previously added also no longer appears in the list of data sets, and I don't know why that is either.

The virtualenv was created and populated this way:

! virtualenv env

created virtual environment CPython3.8.3.final.0-64 in 587ms
  creator CPython3Posix(dest=/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/Users/baileythegreen/Library/Application Support/virtualenv)
    added seed packages: pip==20.3.3, setuptools==51.3.3, wheel==0.36.2
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

! env/bin/pip install -r requirements.txt 

Here is the file I previously managed to upload, and which I am now unable to load.

pride_chapters.csv

I am able to select the file, designate which column is which field, and hit 'upload'; it seems to register the request, but then I get this traceback:

127.0.0.1 - - [25/Sep/2021 16:56:49] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /assets/custom-styles.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /assets/fonts.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /assets/base-styles.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /_dash-layout HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /_dash-dependencies HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "GET /_favicon.ico?v=1.14.0 HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:55] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:58] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:57:58] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:06] "POST /API/resumable?resumableChunkNumber=1&resumableChunkSize=1048576&resumableCurrentChunkSize=4408&resumableTotalSize=4408&resumableType=text%2Fcsv&resumableIdentifier=4408-pride_chapter_1csv&resumableFilename=pride_chapter_1.csv&resumableRelativePath=pride_chapter_1.csv&resumableTotalChunks=1&upload_id=5add0ea8-1e19-11ec-b127-784f43a434f6 HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:06] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:06] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:06] "POST /_dash-update-component HTTP/1.1" 200 -
./whoosh_search/pride_chapter_1_index/ created.
[ Indexing Finished. In total 0 documents. ]
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/baileythegreen/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[2021-09-25 16:58:21,743] ERROR in app: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 1050, in dispatch
    response.set_data(func(*args, outputs_list=outputs_list))
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 985, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "app.py", line 2345, in uploading
    get2 = generate_models_fromapp.build_model(df,corpus_name,content_col)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/topic_model/generate_models_fromapp.py", line 63, in build_model
    nlp = spacy.load('en_core_web_sm', disable=['parser', 'ner'])
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/spacy/__init__.py", line 30, in load
    return util.load_model(name, **overrides)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/spacy/util.py", line 169, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
127.0.0.1 - - [25/Sep/2021 16:58:21] "POST /_dash-update-component HTTP/1.1" 500 -
[2021-09-25 16:58:29,100] ERROR in app: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 1050, in dispatch
    response.set_data(func(*args, outputs_list=outputs_list))
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 985, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "app.py", line 2346, in uploading
    get3 = word2vec.train_model(corpus_name)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/topic_model/word2vec.py", line 10, in train_model
    sentences=pd.read_pickle(processed_file_name).body.values.tolist()[0]
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/pandas/io/pickle.py", line 185, in read_pickle
    with get_handle(
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/pandas/io/common.py", line 651, in get_handle
    handle = open(handle, ioargs.mode)
FileNotFoundError: [Errno 2] No such file or directory: './topic_model/pride_chapter_1/processed_content_pride_chapter_1.pkl'
127.0.0.1 - - [25/Sep/2021 16:58:29] "POST /_dash-update-component HTTP/1.1" 500 -
127.0.0.1 - - [25/Sep/2021 16:58:32] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /assets/base-styles.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /assets/custom-styles.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /assets/fonts.css?m=1625594956.0 HTTP/1.1" 304 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /_dash-layout HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /_dash-dependencies HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "GET /_favicon.ico?v=1.14.0 HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 16:58:33] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:36] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:36] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:40] "POST /API/resumable?resumableChunkNumber=1&resumableChunkSize=1048576&resumableCurrentChunkSize=29023&resumableTotalSize=29023&resumableType=text%2Fcsv&resumableIdentifier=29023-pride_chapterscsv&resumableFilename=pride_chapters.csv&resumableRelativePath=pride_chapters.csv&resumableTotalChunks=1&upload_id=dc4c7f82-1e19-11ec-b127-784f43a434f6 HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:40] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:40] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [25/Sep/2021 17:01:40] "POST /_dash-update-component HTTP/1.1" 200 -
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/baileythegreen/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[2021-09-25 17:01:49,741] ERROR in app: Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/flask/app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 1050, in dispatch
    response.set_data(func(*args, outputs_list=outputs_list))
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/dash/dash.py", line 985, in add_context
    output_value = func(*args, **kwargs)  # %% callback invoked %%
  File "app.py", line 2345, in uploading
    get2 = generate_models_fromapp.build_model(df,corpus_name,content_col)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/topic_model/generate_models_fromapp.py", line 63, in build_model
    nlp = spacy.load('en_core_web_sm', disable=['parser', 'ner'])
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/spacy/__init__.py", line 30, in load
    return util.load_model(name, **overrides)
  File "/Users/baileythegreen/Software/joss-reviews/conTEXT-explorer/env/lib/python3.8/site-packages/spacy/util.py", line 169, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
127.0.0.1 - - [25/Sep/2021 17:01:49] "POST /_dash-update-component HTTP/1.1" 500 -

alicia-ziying-yang commented 2 years ago

Hi @baileythegreen, the problem may be due to your Python version. Please refer to: https://github.com/alicia-ziying-yang/conTEXT-explorer#install-required-dependencies

The Python version has to be 3.7.5, and you may need to run: python -m spacy download en after installing all the packages in requirements.txt. Could you please try these two steps and see if there is still any error?
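
For anyone who wants to check both points before starting the dashboard, here is a minimal sketch of a pre-flight check. It is not part of conTEXT-explorer: the names check_environment, REQUIRED_PYTHON and MODEL_PACKAGE are hypothetical, and it assumes the spaCy model is installed as a regular Python package (a model loaded from a directory path would not be detected this way).

```python
import importlib.util
import sys

# Assumptions taken from this thread, not from the project's code:
REQUIRED_PYTHON = (3, 7, 5)        # version the maintainer asks for
MODEL_PACKAGE = "en_core_web_sm"   # model named in the E050 traceback


def check_environment(version_info=None, find_spec=importlib.util.find_spec):
    """Return a list of problems found; an empty list means both checks passed.

    version_info and find_spec are injectable so the check can be exercised
    without changing the running interpreter or installing anything.
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []

    # Compare only major, minor and micro against the required version.
    current = tuple(version_info[:3])
    if current != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d.%d found, but %d.%d.%d is required"
            % (current + REQUIRED_PYTHON)
        )

    # find_spec returns None when a top-level package cannot be located.
    if find_spec("spacy") is None:
        problems.append("spaCy is not installed; install requirements.txt first")
    elif find_spec(MODEL_PACKAGE) is None:
        problems.append(
            "spaCy model %s not found; try: python -m spacy download en"
            % MODEL_PACKAGE
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running this with the interpreter from the virtualenv (for example env/bin/python) prints one line per problem and nothing if the environment looks fine.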

Thank you for checking!

faroit commented 2 years ago

@alicia-ziying-yang @baileythegreen I was able to install and run the software using python==3.7.5. Since similar issues could be avoided by using Python packaging, I opened #15.