FDA / openfda

openFDA is an FDA project to provide open APIs, raw data downloads, documentation and examples, and a developer community for an important collection of FDA public datasets.
https://open.fda.gov
Creative Commons Zero v1.0 Universal

Latest changes in openFDA that include a Docker Compose configuration to run the pipelines and API locally #133

Closed: dkrylovsb closed 4 years ago

dkrylovsb commented 4 years ago

Running in Docker

If you intend to run openFDA yourself, we have put together a docker-compose.yml configuration that can help you get started. Running docker-compose up will:

  1. Start an Elasticsearch container.
  2. Start an API container, which will expose port 8000 for queries.
  3. Start a Python 2.7 container that will run the NSDE, CAERS, and Substance Data pipelines and create corresponding indices in Elasticsearch.

Note: even though the API container starts right away, it will not serve any data until some or all of the pipelines above have finished running. To see which endpoints have become available, while the pipelines are still running or after they have completed, run:

    curl http://localhost:8000/status

Once an endpoint becomes available, it can be queried using the standard openFDA query syntax. For example:

    curl -g 'http://localhost:8000/food/event.json?search=products.industry_name:"Soft+Drink/Water"+AND+reactions.exact:DEHYDRATION&limit=10'
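For anyone scripting against the local API rather than using curl, the same status check and query can be sketched in Python with only the standard library. This is a rough illustration, not part of the repository: the helper names (build_query_url, fetch_text, wait_for_results) are made up here, and the retry loop assumes that an endpoint whose pipeline has not finished answers with an HTTP error or a refused connection, which the description above does not specify.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # port exposed by the API container


def build_query_url(base_url, endpoint, search, limit=10):
    """Return a query URL with the search expression URL-encoded."""
    params = urllib.parse.urlencode({"search": search, "limit": limit})
    return f"{base_url}/{endpoint}?{params}"


def fetch_text(url, timeout=10):
    """GET a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def wait_for_results(url, attempts=30, delay_seconds=60):
    """Retry the query until it succeeds or the attempts run out.

    Assumption: an endpoint that is not ready yet fails with an HTTP error
    or a refused connection, both of which surface as OSError subclasses.
    """
    for _ in range(attempts):
        try:
            return json.loads(fetch_text(url))
        except OSError:
            time.sleep(delay_seconds)
    return None


if __name__ == "__main__":
    # Print the raw status output to see which endpoints are up.
    print(fetch_text(f"{BASE_URL}/status"))

    # Same query as the curl example above, with spaces instead of "+".
    url = build_query_url(
        BASE_URL,
        "food/event.json",
        'products.industry_name:"Soft Drink/Water" AND reactions.exact:DEHYDRATION',
    )
    print(wait_for_results(url))
```

Here urlencode percent-encodes the quotes, colons, and slash in the search expression and turns spaces into "+", so the curl -g flag has no counterpart in this version.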

At this point the Python container only runs the NSDE, CAERS, and Substance Data pipelines because those are relatively lightweight and require no access to internal FDA networks. We will add more pipelines if there is substantial interest from the community. In the meantime, the three pipelines above provide a good starting point for understanding openFDA internals and/or customizing openFDA.