Hevia / EarthKB

A pipeline for turning Earth & Life science documents (images, videos, academic papers, news articles) into a full-stack knowledge base with neural search
GNU General Public License v3.0

EarthKG

A pipeline for turning Earth & Life science documents into a searchable knowledge base that helps researchers generate new hypotheses.

Features

Planned Features

Setup

Installing dependencies

You should use the provided Dockerfiles for development. If you would rather install locally, follow the steps for your platform below.

Windows: Read this guide on how to install poppler for Windows: https://stackoverflow.com/questions/18381713/how-to-install-poppler-on-windows (required for mmda)

python -m venv wvenv # Create a virtual environment
. .\wvenv\Scripts\activate # Activate it
pip install -r requirements.txt # Install requirements
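Once poppler is installed, a quick check can confirm that its command-line tools are visible from the virtual environment. The snippet below is a minimal sketch, not part of this repository: it assumes that what matters is having the poppler utilities `pdftoppm` and `pdfinfo` on your PATH, and the function name `missing_poppler_tools` is hypothetical.

```python
import shutil

# Two of the command-line utilities shipped with poppler.
# Assumption: having these on PATH is what the PDF tooling needs.
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def missing_poppler_tools(which=shutil.which):
    """Return the poppler tools that cannot be found on PATH."""
    return [tool for tool in POPPLER_TOOLS if which(tool) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("poppler tools found")
```

If a tool is reported as missing, the usual cause is that the poppler `bin` directory has not been added to PATH, or that the terminal was opened before PATH was changed.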

Linux:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

All Systems:

python # Start a Python REPL in your command prompt
>>> import nltk
>>> nltk.download('wordnet')
>>> nltk.download('omw-1.4')
>>> exit()
python -m spacy download en_core_web_sm # Install the spaCy language model you want to use
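The NLTK and spaCy downloads above can also be run non-interactively, which may be handy in a Dockerfile or CI job. The following is a sketch under stated assumptions rather than a script shipped with this project: the function names are hypothetical, and it relies on the `python -m nltk.downloader` and `python -m spacy download` entry points. By default it only prints the commands; pass `--run` to execute them.

```python
import importlib.util
import subprocess
import sys

NLTK_PACKAGES = ["wordnet", "omw-1.4"]
SPACY_MODEL = "en_core_web_sm"  # swap in the spaCy model you want to use


def build_commands(python=sys.executable, spacy_model=SPACY_MODEL):
    """Return the download commands as argument lists, without running them."""
    commands = [[python, "-m", "nltk.downloader", *NLTK_PACKAGES]]
    # A spaCy model is installed as a regular Python package,
    # so the download is skipped when it is already importable.
    if importlib.util.find_spec(spacy_model) is None:
        commands.append([python, "-m", "spacy", "download", spacy_model])
    return commands


def main(argv):
    run = "--run" in argv
    for command in build_commands():
        print(" ".join(command))
        if run:
            # check=True stops at the first failed download.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Using `sys.executable` keeps the downloads inside whichever virtual environment is currently active, so the data and model end up next to the packages installed from requirements.txt.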

Getting the data