karpathy / arxiv-sanity-preserver

Web interface for browsing, search and filtering recent arxiv submissions
http://www.arxiv-sanity.com/
MIT License

Add Dockerfile and supporting script #38

Open bskaggs opened 8 years ago

bskaggs commented 8 years ago

This adds a Dockerfile that builds a Docker image for running the web interface and supporting scripts.

All dynamic data is saved in the data directory, and symlinks are created from the files' normal locations to this directory, so that no existing scripts need to be changed. At run time, if a host directory is mounted as the data directory, a user account is created with the same user ID and group ID as the owner of that directory; this ensures that all files are written with the proper owner.

If no command is specified at run time, the server starts on port 8080. If a command is specified, it will be run after the user is created, again to ensure the correct file ownership. In either case, both the secret key and the database are created if they don't yet exist in the data directory.
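
The ownership and symlink handling described above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not the script from this pull request; the helper names `owner_ids` and `link_into_data_dir` are hypothetical, and the step of creating the matching user and dropping privileges is left out because it depends on the base image.

```python
import os


def owner_ids(data_dir):
    """Return the (uid, gid) that own data_dir, as reported by os.stat."""
    st = os.stat(data_dir)
    return st.st_uid, st.st_gid


def link_into_data_dir(name, app_dir, data_dir):
    """Make app_dir/name a symlink pointing at data_dir/name.

    Nothing is changed if app_dir/name already exists (including as a
    dangling symlink), so the call is safe to repeat on every start.
    """
    target = os.path.join(data_dir, name)
    link = os.path.join(app_dir, name)
    if not os.path.lexists(link):
        os.symlink(target, link)
    return link
```

An entrypoint built along these lines would call `owner_ids` on the mounted directory, create a user with those IDs, link each data file into place, and only then run the requested command as that user.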

machawk1 commented 6 years ago

@bskaggs I run into an issue when I use your fork:

$ git clone https://github.com/bskaggs/arxiv-sanity-preserver
$ cd arxiv-sanity-preserver/
$ docker image build -t arxiv-sanity .
$ docker container run arxiv-sanity
Changing to root
/usr/local/lib/python2.7/site-packages/flask_limiter/extension.py:113: UserWarning: Use of the default `get_ipaddr` function is discouraged. Please refer to https://flask-limiter.readthedocs.org/#rate-limit-domain for the recommended configuration
  " for the recommended configuration", UserWarning
/usr/local/lib/python2.7/site-packages/flask_limiter/extension.py:640: UserWarning: global_limits was a badly name configuration since it is actually a default limit and not a  globally shared limit. Use default_limits if you want to provide a default or use application_limits  if you intend to really have a global shared limit
  " if you intend to really have a global shared limit", UserWarning
Namespace(num_results=200, port=8080, prod=True)
loading db.p...
Traceback (most recent call last):
  File "serve.py", line 380, in <module>
    db = pickle.load(open('db.p', 'rb'))
IOError: [Errno 2] No such file or directory: 'db.p'

Can you provide advice on getting your fork working within Docker (version 18.06.1-ce, build e68fc7a)?

bskaggs commented 6 years ago

I'm sorry, I haven't touched this in years. I think you have to run the processing pipeline to populate some files that are missing: https://github.com/karpathy/arxiv-sanity-preserver/blob/1682975ccca50f1a0582c806aacd07266f689061/README.md#processing-pipeline
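
As a rough illustration of the failure above, a small pre-flight check could report missing pipeline outputs before the server tries to load them. This is a sketch only: `missing_files` is a hypothetical helper, and `db.p` is the only file name taken from the traceback; any other names would have to come from the processing pipeline described in the README.

```python
import os


def missing_files(required, base_dir="."):
    """Return the names in `required` that do not exist under base_dir.

    os.path.exists follows symlinks, so a symlink into an empty data
    directory is reported as missing too.
    """
    return [name for name in required
            if not os.path.exists(os.path.join(base_dir, name))]
```

Calling `missing_files(["db.p"])` from the application directory returns `["db.p"]` until the processing pipeline has produced that file, and an empty list afterwards.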