UW-xDD / blackstack

Entity extraction from PDFs with Tesseract and Machine Learning
MIT License

WIP: add Dockerfile and docker-compose file to run this with a postgres for testing purposes #3

Closed metazool closed 6 years ago

metazool commented 6 years ago

My test is not working, and I haven't dug into why, but perhaps it's because the sample PDF does not contain any tables or figures? Either way, your thoughts on the value of doing this would be appreciated.

"Add a Dockerfile and docker-compose file to run this with 'docker-compose up', using a self-contained postgres which loads the example model on startup. Add a config file which can be populated from the environment passed to the container at runtime.

Tweak the Python in a few places to make the print statements Python 3 compatible (!)

Add shapely to the requirements and import psycopg2 somewhere it was needed"
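
For readers unfamiliar with the pattern, a config that is "populated from the environment passed to the container" might look roughly like the sketch below. This is not the code from the PR: the variable names (`PG_HOST`, `PG_PORT`, and so on), the defaults, and the helpers `load_config` and `connection_kwargs` are all hypothetical, chosen only for illustration.

```python
import os

# Hypothetical defaults; the actual variable names used in the PR are not shown.
DEFAULTS = {
    "PG_HOST": "localhost",
    "PG_PORT": "5432",
    "PG_USER": "postgres",
    "PG_PASSWORD": "",
    "PG_DATABASE": "blackstack",
}


def load_config(environ=None):
    """Return settings, preferring values from the environment over defaults."""
    environ = os.environ if environ is None else environ
    config = {key: environ.get(key, default) for key, default in DEFAULTS.items()}
    # Environment variables are always strings, so convert the port explicitly.
    config["PG_PORT"] = int(config["PG_PORT"])
    return config


def connection_kwargs(config):
    """Map the settings onto keyword arguments that psycopg2.connect() accepts."""
    return {
        "host": config["PG_HOST"],
        "port": config["PG_PORT"],
        "user": config["PG_USER"],
        "password": config["PG_PASSWORD"],
        "dbname": config["PG_DATABASE"],
    }
```

With something like this in place, the values set under `environment:` in a compose file would reach the application unchanged, and `psycopg2.connect(**connection_kwargs(load_config()))` would open the connection; running outside a container falls back to the defaults.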

jczaplew commented 6 years ago

Awesome! I was thinking that a Docker container for this would be prudent because of the dependencies, but I don't have a ton of experience with it.

Yeah...I still run Python 2. This is a good excuse to finally upgrade.
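
For context on the print-statement change mentioned in the PR description, one common way to keep a module working on both Python 2.7 and Python 3 during such an upgrade is sketched below. The `report` function and its arguments are hypothetical and do not come from the blackstack code.

```python
# Must be the first statement in the module; it is a no-op on Python 3
# and turns print into a function on Python 2.7.
from __future__ import print_function

import sys


def report(page, label):
    """Format and print a status line; runs unchanged on Python 2.7 and 3."""
    message = "page {}: {}".format(page, label)
    print(message)                   # function-call form, valid on both versions
    print(message, file=sys.stderr)  # keyword arguments need the print function
    return message
```

The old `print message` statement form is a syntax error on Python 3, which is why even a small change like this is needed before the code can run in a Python 3 container.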