allenai / allennlp-demo

Code for the AllenNLP demo.
https://demo.allennlp.org
Apache License 2.0

Running a Local Environment fails #526

Open nitishgupta opened 4 years ago

nitishgupta commented 4 years ago

I ran the command MODEL=bidaf_elmo docker-compose up --build and according to the logs the demo started correctly

ui_1          | You can now view demo in the browser.
ui_1          | 
ui_1          |   Local:            http://localhost:3000

When I try to get predictions I get the "Something went wrong. Please try again" error. The Chrome console shows the error:

api/bidaf-elmo/predict:1 Failed to load resource: the server responded with a status of 404 (Not Found)
index.js:1 

Error: Predict call failed.
at Model.js:63
nitishgupta commented 4 years ago

@epwalsh, Matt said that you might be able to help.

epwalsh commented 4 years ago

Hey @nitishgupta, I would try just running the model image by itself to debug. Instructions for running the model image by itself are here: https://github.com/allenai/allennlp-demo/tree/master/api#building

matt-gardner commented 4 years ago

Can you access the UI if you do that?

epwalsh commented 4 years ago

No, but at least you could easily see the error messages from the server (if that's where the issue is).

matt-gardner commented 4 years ago

I guess, what I don't know is how to actually run the demo during development now. Since the split of the UI and the API, there are no longer instructions for how to do local development (running both the UI and the API together, and having them talk to each other). Or are they somewhere and I just missed them?

epwalsh commented 4 years ago

Did you see step 3 here: https://github.com/allenai/allennlp-demo#running-a-local-environment ?

nitishgupta commented 4 years ago

@epwalsh - Inside the api directory I ran make bidaf_elmo-run, which seems to be running fine (output below).

Then I followed the steps in Local Development, but when I run the FLASK_ENV=development python allennlp_demo/bidaf_elmo/api.py command I get a module import error:

    from allennlp_demo.common import config, http
ModuleNotFoundError: No module named 'allennlp_demo'

Output from make

docker run --rm \
        -p 8000:8000 \
        -v $HOME/.allennlp:/root/.allennlp \
        -v $HOME/.cache/torch:/root/.cache/torch \
        -v $HOME/nltk_data:/root/nltk_data \
        allennlp-demo-bidaf_elmo:latest
100%|██████████| 418130723/418130723 [00:24<00:00, 16790022.34B/s]
100%|██████████| 336/336 [00:00<00:00, 1507257.91B/s]
100%|██████████| 374434792/374434792 [00:34<00:00, 10831506.04B/s]
{"logname": "root", "severity": "INFO", "message": "Loading a model trained before embedding extension was implemented; pass an explicit vocab namespace if you want to extend the vocabulary."}
 * Serving Flask app "bidaf-elmo" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
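
The ModuleNotFoundError above is standard Python behavior: when a script is run directly, the script's own directory goes at the front of sys.path, not the current working directory, so the top-level allennlp_demo package is not importable from api.py. A minimal illustration, run from the api directory:

# Run as: python allennlp_demo/bidaf_elmo/api.py
# sys.path[0] is then <repo>/api/allennlp_demo/bidaf_elmo, not <repo>/api,
# so "import allennlp_demo" raises ModuleNotFoundError.
import sys
print(sys.path[0])
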
epwalsh commented 4 years ago

Ah, yeah, to fix that error you should install the allennlp_demo module locally:

cd api && pip install -e .
nitishgupta commented 4 years ago

But there isn't a setup.py for pip install ...
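
For reference, a minimal setup.py for the api directory that would make pip install -e . work; this is a hypothetical sketch, not a file from the repo:

# Hypothetical minimal setup.py for the api/ directory (not part of the repo).
from setuptools import find_packages, setup

setup(
    name="allennlp_demo",
    version="0.0.1",
    packages=find_packages(include=["allennlp_demo", "allennlp_demo.*"]),
)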

epwalsh commented 4 years ago

That's a good point 😏

Try setting the PYTHONPATH env variable then:

PYTHONPATH=./ FLASK_ENV=development python allennlp_demo/bidaf_elmo/api.py
nitishgupta commented 4 years ago

Great, I didn't know about the PYTHONPATH env variable.

So the error now is --

Traceback (most recent call last):
  File "allennlp_demo/bidaf_elmo/api.py", line 15, in <module>
    endpoint = BidafElmoModelEndpoint()
  File "allennlp_demo/bidaf_elmo/api.py", line 11, in __init__
    super().__init__(c)
  File "/Users/nitishgupta/code/allennlp-demo/api/allennlp_demo/common/http.py", line 82, in __init__
    self.predictor = Predictor.from_archive(archive, model.predictor_name)
  File "/Users/nitishgupta/miniconda3/envs/allennlp-demo-bidaf/lib/python3.7/site-packages/allennlp/predictors/predictor.py", line 307, in from_archive
    ) if predictor_name is not None else cls
  File "/Users/nitishgupta/miniconda3/envs/allennlp-demo-bidaf/lib/python3.7/site-packages/allennlp/common/registrable.py", line 137, in by_name
    subclass, constructor = cls.resolve_class_name(name)
  File "/Users/nitishgupta/miniconda3/envs/allennlp-demo-bidaf/lib/python3.7/site-packages/allennlp/common/registrable.py", line 185, in resolve_class_name
    f"{name} is not a registered name for {cls.__name__}. "
allennlp.common.checks.ConfigurationError: reading-comprehension is not a registered name for Predictor. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.

BTW, when I ran the UI + API for other models, I faced the same issue; I'm guessing the predictor not being found is a common problem for the other models I tried.

nitishgupta commented 4 years ago

Related -- I've faced a similar issue when using Predictor.from_archive elsewhere: I get the same error until I import the required Predictor class in my Python code (even if the import is unused). I've never understood why that's needed.
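
This is AllenNLP's Registrable pattern at work: a subclass joins the registry only when the module containing its @Predictor.register(...) decorator is imported, so a seemingly unused import is exactly what makes the name lookup succeed. A simplified sketch of the mechanism (illustrative, not AllenNLP's actual internals):

# Simplified sketch of the registration pattern behind Predictor.by_name().
class Registrable:
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(subclass):
            cls._registry[name] = subclass  # runs at import time of the subclass's module
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name):
        return cls._registry[name]

class Predictor(Registrable):
    pass

# Nothing below is registered until this module is imported somewhere,
# which is why the "unused" import is load-bearing.
@Predictor.register("reading_comprehension")
class ReadingComprehensionPredictor(Predictor):
    pass

assert Predictor.by_name("reading_comprehension") is ReadingComprehensionPredictor
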

epwalsh commented 4 years ago

Do you have allennlp-models installed in the same Python environment?
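
A quick way to check, assuming both packages were installed with pip into the active environment:

# Print the versions of allennlp and allennlp-models the current interpreter
# sees; a mismatch with what the Docker image installs can cause registry
# errors like the one above.
import pkg_resources

for dist in ("allennlp", "allennlp-models"):
    print(dist, pkg_resources.get_distribution(dist).version)
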

matt-gardner commented 4 years ago

@epwalsh, sorry for being dense, this is me not really understanding docker-compose, and remembering things incorrectly. But, I just tried running the commands that you linked exactly as-is, and I see the same error that @nitishgupta is reporting. I tried accessing the UI from both port 3000 (as @nitishgupta did, I think) and from port 8080 (as the instructions say to do), and I get an error in both cases. From port 3000 it's a 404, and from port 8080 it's a 502.

nitishgupta commented 4 years ago

Okay, so with bidaf-elmo the issue was an incorrect predictor name in model.json (reading-comprehension in the demo vs. reading_comprehension in allennlp-models). After fixing this, api.py does run, and when I open the browser I see JSON output. Another issue, though:

I killed the make bidaf_elmo-run command; when I restart it, the docker run command comes up fine, but when I run api.py, the Docker container dies with the reading_comprehension is not a registered name for Predictor error.
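
One way to check a predictor name from model.json against what is actually registered; a sketch that assumes allennlp-models is importable, that importing the top-level package triggers the registrations, and that the model.json path shown is illustrative:

# Check whether the predictor_name in a model.json is known to the registry.
import json

from allennlp.predictors.predictor import Predictor
import allennlp_models  # noqa: F401 -- imported only for registration side effects

with open("allennlp_demo/bidaf_elmo/model.json") as f:
    name = json.load(f)["predictor_name"]

available = Predictor.list_available()
print(name, "is registered" if name in available else f"is NOT registered; available: {available}")
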

epwalsh commented 4 years ago

@nitishgupta looks like this is due to a mismatch between the version of allennlp / allennlp-models in your local environment and the one that gets installed in the Docker image.

@matt-gardner I'll take a look to see if I can get that running.

nitishgupta commented 4 years ago

Okay; there is no Dockerfile in bidaf_elmo, so I don't know which version of allennlp-models is installed in the Docker image. If it is allennlp-models==1.0.0rc3, as pinned in the requirements.txt in the api directory, the predictor there is registered as reading_comprehension as well, so I'm not sure what is happening.

When I run make nmn_drop-run, the Docker container runs, and on opening localhost:8000 I see:

{
  "allennlp": "0.9.0-unreleased",
  "archive_file": "https://storage.googleapis.com/allennlp-public-models/drop-nmn-2020.04.04.tar.gz",
  "id": "nmn",
  "overrides": {"model": {"beam_size": 1, "debug": true}},
  "predictor_name": "drop_demo_predictor"
}

Does this indicate that the model is loading fine and that the issue in the demo is somewhere else, which I think you'll be looking into with the docker-compose command? Sorry, I don't understand exactly how this is set up, so I can't come up with next debugging steps; I'll probably just wait for an update from you.