ude-soco / RIMA

https://rima.soco.inko.cloud

Unable to load stanfordcorenlp #374

Open shoebjoarder opened 11 months ago

shoebjoarder commented 11 months ago

In the development branch, I am unable to figure out why the logs differ between running the development server with and without Docker. Here is the log when we start the development server without Docker:

...
============================
Loading ELMO Weight File...
C:\Users\shoeb\Desktop\RIMA\RIMA-Backend\model\elmo\elmo.hdf5
============================
2023-11-10 09:36:46,400 INFO [django.utils.autoreload:586] autoreload 4064 11140 Watching for file changes with StatReloader
Performing system checks...

2023-11-10 09:36:47.050463: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2023-11-10 09:36:47.050786: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-11-10 09:36:52,180 INFO [pytorch_pretrained_bert.modeling:230] modeling 4064 11140 Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2023-11-10 09:36:52,195 INFO [pytorch_transformers.modeling_bert:226] modeling_bert 4064 11140 Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2023-11-10 09:36:52,195 INFO [pytorch_transformers.modeling_xlnet:339] modeling_xlnet 4064 11140 Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2023-11-10 09:36:52,418 INFO [allennlp.common.registrable:73] registrable 4064 11140 instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2023-11-10 09:36:52,418 INFO [allennlp.common.registrable:73] registrable 4064 11140 instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2023-11-10 09:36:52,418 INFO [allennlp.common.registrable:73] registrable 4064 11140 instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2023-11-10 09:36:52,418 INFO [allennlp.common.registrable:73] registrable 4064 11140 instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2023-11-10 09:36:52,551 INFO [allennlp.commands.elmo:174] elmo 4064 11140 Initializing ELMo.
2023-11-10 09:36:58,461 INFO [root:88] corenlp 4064 11140 Initializing native server...
2023-11-10 09:36:58,461 INFO [root:98] corenlp 4064 11140 java -Xmx4g -cp "C:\Users\shoeb\Desktop\RIMA\RIMA-Backend\model\stanford-corenlp\*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9002
2023-11-10 09:36:58,463 INFO [root:107] corenlp 4064 11140 Server shell PID: 7676
2023-11-10 09:36:59,468 INFO [root:118] corenlp 4064 11140 The server is available.
2023-11-10 09:36:59,623 INFO [sentence_transformers.SentenceTransformer:66] SentenceTransformer 4064 11140 Load pretrained SentenceTransformer: C:\Users\shoeb\Desktop\RIMA\RIMA-Backend\model\msmarco\
2023-11-10 09:37:00,494 INFO [sentence_transformers.SentenceTransformer:105] SentenceTransformer 4064 11140 Use pytorch device: cpu
r59YidzDfn5aVNZMOtaV36Wd3zGJsMIs8dKd12sA
System check identified no issues (0 silenced).
November 10, 2023 - 09:37:01
Django version 2.2.3, using settings 'interest_miner_api.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.

We can clearly see that stanfordcorenlp is recognized and running on port 9002:

java -Xmx4g -cp "C:\Users\shoeb\Desktop\RIMA\RIMA-Backend\model\stanford-corenlp\*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9002
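A quick way to confirm the server is actually reachable (from the host, or from inside a container) is a plain TCP probe. This is a debugging sketch, not part of RIMA; is_port_open is a hypothetical helper:

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Prints True only if something is listening on port 9002.
print(is_port_open("127.0.0.1", 9002))
```

Running this inside the worker container versus on the host would show whether the CoreNLP server is up in each environment.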

The logs for rima-backend-worker-1 at least provide information about TensorFlow not being able to find a GPU on the machine, but the rest of the logs regarding pytorch, allennlp, and stanfordcorenlp are not shown when the server starts.

...
2023-11-12 11:06:34 ============================
2023-11-12 11:06:34 Loading ELMO Weight File...
2023-11-12 11:06:34 /home/app/.model/elmo/elmo.hdf5
2023-11-12 11:06:34 ============================
2023-11-12 11:06:42 coreNLP:  <interests.Keyword_Extractor.Algorithms.embedding_based.sifrank.taggers.stanford_core_nlp_tagger.StanfordCoreNLPTagger object at 0x7f60e8d74990>
2023-11-12 11:06:43 None
2023-11-12 11:06:44  
2023-11-12 11:06:44  -------------- celery@320592d88068 v4.3.0 (rhubarb)
2023-11-12 11:06:44 ---- **** ----- 
2023-11-12 11:06:44 --- * ***  * -- Linux-6.4.16-linuxkit-x86_64-with-debian-12.1 2023-11-12 10:06:44
2023-11-12 11:06:44 -- * - **** --- 
2023-11-12 11:06:44 - ** ---------- [config]
2023-11-12 11:06:44 - ** ---------- .> app:         interest_miner_api:0x7f612b8cad50
2023-11-12 11:06:44 - ** ---------- .> transport:   redis://backend-redis:6379//
2023-11-12 11:06:44 - ** ---------- .> results:     redis://backend-redis:6379/
2023-11-12 11:06:44 - *** --- * --- .> concurrency: 1 (prefork)
2023-11-12 11:06:44 -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
2023-11-12 11:06:44 --- ***** ----- 
2023-11-12 11:06:44  -------------- [queues]
2023-11-12 11:06:44                 .> celery           exchange=celery(direct) key=celery
2023-11-12 11:06:44                 
2023-11-12 11:06:44 
2023-11-10 18:08:24 2023-11-10 17:08:24.861404: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2023-11-10 18:08:24 2023-11-10 17:08:24.861427: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-11-10 18:08:41 2023-11-10 17:08:41.899538: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2023-11-10 18:08:41 2023-11-10 17:08:41.899562: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-11-12 11:06:20 2023-11-12 10:06:20.775126: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2023-11-12 11:06:20 2023-11-12 10:06:20.775191: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-11-12 11:06:35 2023-11-12 10:06:35.265304: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2023-11-12 11:06:35 2023-11-12 10:06:35.265440: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

Moreover, the rima-backend-api-1 container logs are quite different, not even showing the TensorFlow warning about the GPU:

...
2023-11-12 11:06:19 ============================
2023-11-12 11:06:19 Loading ELMO Weight File...
2023-11-12 11:06:19 /home/app/.model/elmo/elmo.hdf5
2023-11-12 11:06:19 ============================
2023-11-10 18:08:24 [2023-11-10 17:08:24 +0000] [20] [INFO] Starting gunicorn 21.2.0
2023-11-10 18:08:24 [2023-11-10 17:08:24 +0000] [20] [INFO] Listening at: http://0.0.0.0:8000 (20)
2023-11-10 18:08:24 [2023-11-10 17:08:24 +0000] [20] [INFO] Using worker: sync
2023-11-10 18:08:24 [2023-11-10 17:08:24 +0000] [23] [INFO] Booting worker with pid: 23
2023-11-12 11:06:19 [2023-11-12 10:06:19 +0000] [20] [INFO] Starting gunicorn 21.2.0
2023-11-12 11:06:19 [2023-11-12 10:06:19 +0000] [20] [INFO] Listening at: http://0.0.0.0:8000 (20)
2023-11-12 11:06:19 [2023-11-12 10:06:19 +0000] [20] [INFO] Using worker: sync
2023-11-12 11:06:19 [2023-11-12 10:06:19 +0000] [23] [INFO] Booting worker with pid: 23

I need help understanding why exactly these files are not recognized by the backend libraries when running in Docker.

ralf-berger commented 11 months ago

Can't tell what commands you're running locally and what your verbosity setting is. The worker process is started with level warning:

https://github.com/ude-soco/RIMA/blob/5edd2c87f1d4567f4902323e732bf07a515f5939/RIMA-Backend/bin/worker#L11
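To illustrate why that matters: with the worker started at level warning, INFO-level startup messages (the pytorch, allennlp, and corenlp lines) are filtered out before they ever reach the console. A minimal sketch of Python's standard logging behavior, not RIMA code:

```python
import logging
from io import StringIO

def startup_logs(level: int) -> str:
    """Simulate startup chatter captured at a given log level."""
    buf = StringIO()
    logger = logging.Logger("demo")  # standalone logger, not the global registry
    logger.setLevel(level)
    logger.addHandler(logging.StreamHandler(buf))
    logger.info("Initializing ELMo.")           # the kind of line seen locally
    logger.warning("Could not load libcudart")  # still visible at level warning
    return buf.getvalue()

print(startup_logs(logging.WARNING))  # only the warning line survives
print(startup_logs(logging.INFO))     # both lines appear
```

So identical code can produce very different-looking logs purely because of the -l flag passed to the worker.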

shoebjoarder commented 11 months ago

I am using the following command locally on my Windows machine: celery -A server worker -l info -P eventlet

I changed the bin/worker file to: celery -A interest_miner_api worker -c 1 -l info. The logs now show this:

rima-backend-worker-1    | ============================
rima-backend-worker-1    | Loading ELMO Weight File...
rima-backend-worker-1    | /home/app/.model/elmo/elmo.hdf5
rima-backend-worker-1    | ============================
rima-backend-worker-1    | 2023-11-12 12:21:38.098223: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
rima-backend-worker-1    | 2023-11-12 12:21:38.098260: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
rima-backend-worker-1    | coreNLP:  <interests.Keyword_Extractor.Algorithms.embedding_based.sifrank.taggers.stanford_core_nlp_tagger.StanfordCoreNLPTagger object at 0x7f7591ec5d90>
rima-backend-worker-1    | None
rima-backend-worker-1    |  
rima-backend-worker-1    |  -------------- celery@1bd72ed55c0e v4.3.0 (rhubarb)
rima-backend-worker-1    | ---- **** ----- 
rima-backend-worker-1    | --- * ***  * -- Linux-6.4.16-linuxkit-x86_64-with-debian-12.1 2023-11-12 12:21:46
rima-backend-worker-1    | -- * - **** --- 
rima-backend-worker-1    | - ** ---------- [config]
rima-backend-worker-1    | - ** ---------- .> app:         interest_miner_api:0x7f75d3ba5c90
rima-backend-worker-1    | - ** ---------- .> transport:   redis://backend-redis:6379//
rima-backend-worker-1    | - ** ---------- .> results:     redis://backend-redis:6379/
rima-backend-worker-1    | - *** --- * --- .> concurrency: 1 (prefork)
rima-backend-worker-1    | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
rima-backend-worker-1    | --- ***** ----- 
rima-backend-worker-1    |  -------------- [queues]
rima-backend-worker-1    |                 .> celery           exchange=celery(direct) key=celery
rima-backend-worker-1    |                 
rima-backend-worker-1    | 
rima-backend-worker-1    | [tasks]
rima-backend-worker-1    |   . getConnectedAuthorsData
rima-backend-worker-1    |   . getRefCitAuthorsPapers
rima-backend-worker-1    |   . import_papers
rima-backend-worker-1    |   . import_papers_for_user
rima-backend-worker-1    |   . import_tweets
rima-backend-worker-1    |   . import_tweets_for_user
rima-backend-worker-1    |   . import_user_citation_data
rima-backend-worker-1    |   . import_user_data
rima-backend-worker-1    |   . import_user_paperdata
rima-backend-worker-1    |   . import_user_papers
rima-backend-worker-1    |   . interests.publication.publication_utils.process_publication
rima-backend-worker-1    |   . manual_regenerate_long_term_model
rima-backend-worker-1    |   . regenerate_interest_profile
rima-backend-worker-1    |   . regenerate_short_term_interest_model
rima-backend-worker-1    |   . update_long_term_interest_model
rima-backend-worker-1    |   . update_long_term_interest_model_for_user
rima-backend-worker-1    |   . update_short_term_interest_model
rima-backend-worker-1    |   . update_short_term_interest_model_for_user
rima-backend-worker-1    | 
rima-backend-worker-1    | [2023-11-12 12:21:46,513: INFO/MainProcess] Connected to redis://backend-redis:6379//
rima-backend-worker-1    | [2023-11-12 12:21:46,519: INFO/MainProcess] mingle: searching for neighbors
rima-backend-worker-1    | [2023-11-12 12:21:47,531: INFO/MainProcess] mingle: all alone
rima-backend-worker-1    | [2023-11-12 12:21:47,544: INFO/MainProcess] celery@1bd72ed55c0e ready.

Still, the logs don't show that the worker was able to load the models downloaded into the .model folder.
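One way to rule out missing files is to check the paths directly inside the container (e.g. via docker compose exec). The sketch below is purely illustrative: only the elmo path appears in the container logs above; the stanford-corenlp and msmarco entries are assumptions mirroring the Windows layout.

```python
import os

# First path taken from the container logs; the other two are assumed.
MODEL_PATHS = [
    "/home/app/.model/elmo/elmo.hdf5",
    "/home/app/.model/stanford-corenlp",
    "/home/app/.model/msmarco",
]

def report_missing(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

if __name__ == "__main__":
    missing = report_missing(MODEL_PATHS)
    print("missing:", missing or "none")
```

If any path turns up missing inside the container, that would explain the libraries silently skipping model initialization.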

I have tried to install the Python packages from the Pipfile locally on my Ubuntu machine, and it fails to install dependencies such as scikit-learn and tensorflow, reporting a mismatch in the dependencies being installed and failing to find scikit-learn and tensorflow versions for Python 3.7. I am just wondering how the Docker container was able to build without any issues...

ralf-berger commented 11 months ago

Can't really help with https://github.com/ude-soco/RIMA/tree/development. gensim 3.8.3 and old numpy versions don't work on ARM CPUs, so I can only build more recent versions.