[TOC]
Gregory is an AI system that uses machine learning and natural language processing to track clinical research and identify papers that can improve the wellbeing of patients.
Sources for research can be added by RSS feed or manually.
The output can be viewed on a static site generated with build.py, or through the API provided by the Django REST Framework.
The docker compose file also includes a Metabase container which is used to build dashboards and manage notifications.
Sources can also be added to monitor Clinical Trials, in which case Gregory can notify a list of email subscribers.
For other integrations, the Django app provides RSS feeds with a live update of relevant research and newly posted clinical trials.
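As a minimal sketch of such an integration, the snippet below parses an RSS 2.0 document with Python's standard library and lists the title and link of each item. The sample XML and the helper name `parse_feed` are illustrative assumptions, not taken from Gregory; in practice you would download the document from one of the feed URLs on your own instance.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; a real feed would be downloaded from your instance.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First example article</title>
      <link>https://example.org/articles/1</link>
    </item>
    <item>
      <title>Second example article</title>
      <link>https://example.org/articles/2</link>
    </item>
  </channel>
</rss>
"""


def parse_feed(xml_text):
    """Return a list of (title, link) tuples, one per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        entries.append((title, link))
    return entries


if __name__ == "__main__":
    for title, link in parse_feed(SAMPLE_FEED):
        print(f"{title}: {link}")
```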
Clone the repository:
git clone <repository_url>
cd <repository_directory>
docker compose up -d
docker exec admin python manage.py makemigrations
docker exec admin python manage.py migrate
For api.domain.etc:
Log in to your DNS provider.
Add a new A record for api.domain.etc pointing to your server's IP address.
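For mg.domain.etc:
Configure this subdomain with Mailgun, then copy your key from API Keys into the .env file.
For ORCID, open Developer Tools and create an API client, then add its credentials to the .env file: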
ORCID_CLIENT_ID=your_orcid_client_id
ORCID_CLIENT_SECRET=your_orcid_client_secret
DOMAIN_NAME=DOMAIN.COM
# Set this to the subdomain you configured with Mailgun. Example: mg.domain.com
EMAIL_DOMAIN=
# The SMTP server and credentials you are using. For example: smtp.eu.mailgun.org
# These variables are only needed if you plan to send notification emails
EMAIL_HOST=
EMAIL_HOST_PASSWORD=
EMAIL_HOST_USER=
# We use Mailgun by default for the newsletters; input your API key and API URL here
EMAIL_MAILGUN_API=
EMAIL_MAILGUN_API_URL=
EMAIL_PORT=587
EMAIL_USE_TLS='True'
# Where you cloned the repository
GREGORY_DIR=
# Set your postgres DB and credentials
POSTGRES_DB=
POSTGRES_PASSWORD=
POSTGRES_USER=
SECRET_KEY='Yeah well, you know, that is just, like, your DJANGO SECRET_KEY, man' # you should set this manually https://docs.djangoproject.com/en/4.0/ref/settings/#secret-key
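Values in a .env file arrive as plain strings, so a flag such as EMAIL_USE_TLS has to be converted before it can be used as a boolean. The sketch below shows one way to do that. `env_bool` and `email_settings` are hypothetical helpers, and this is not necessarily how Gregory's own settings module reads these variables; the keys of the returned dictionary are standard Django email settings.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(value, default=False):
    """Interpret a string from the environment as a boolean."""
    if value is None:
        return default
    # Strip whitespace and any quotes left in place by the .env loader.
    return value.strip().strip("'\"").lower() in TRUE_VALUES


def email_settings(env=None):
    """Build Django-style email settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    return {
        "EMAIL_HOST": env.get("EMAIL_HOST", ""),
        # Fall back to 587 when the variable is missing or left empty.
        "EMAIL_PORT": int(env.get("EMAIL_PORT") or 587),
        "EMAIL_USE_TLS": env_bool(env.get("EMAIL_USE_TLS"), default=True),
        "EMAIL_HOST_USER": env.get("EMAIL_HOST_USER", ""),
        "EMAIL_HOST_PASSWORD": env.get("EMAIL_HOST_PASSWORD", ""),
    }


if __name__ == "__main__":
    example = {"EMAIL_HOST": "smtp.eu.mailgun.org", "EMAIL_USE_TLS": "'True'"}
    print(email_settings(example))
```

Accepting both `'True'` and `true` means either spelling of the flag in the .env file produces the same setting.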
sudo apt-get update
sudo apt-get install nginx
sudo nano /etc/nginx/sites-available/default
sudo nginx -t
sudo systemctl restart nginx
sudo apt-get install certbot python3-certbot-nginx
sudo certbot --nginx -d domain.etc -d www.domain.etc
sudo ufw allow 'Nginx Full'
sudo ufw enable
Go to Sites and click Create Site.
Go to Teams and click Create Team.
Go to Teams, select the team, and click Add User.
Go to Sources and click Add Source.
Choose the RSS method and provide the necessary configuration.
# Every 2 days at 8:00
0 8 */2 * * /usr/bin/docker exec admin python manage.py send_admin_summary
# Every Tuesday at 8:05
5 8 * * 2 docker exec admin python manage.py send_weekly_summary
# every 12 hours, at minute 25
25 */12 * * * /usr/bin/flock -n /tmp/pipeline /usr/bin/docker exec admin ./manage.py pipeline
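To read schedules like the ones above, it helps to see how a single cron field is matched. The sketch below handles only the three forms used here (`*`, `*/step`, and a plain number); `field_matches` is a hypothetical helper, not part of Gregory or of cron itself. A step is counted from the start of the field's range, so `*/2` in the day-of-month field (range 1 to 31) fires on days 1, 3, 5, and so on, which is close to, but not exactly, every 48 hours across month boundaries.

```python
def field_matches(field, value, minimum=0):
    """Check one cron field against a value.

    Supports "*", "*/step", and a plain integer only.
    `minimum` is the lowest legal value of the field (1 for day-of-month).
    """
    if field == "*":
        return True
    if field.startswith("*/"):
        step = int(field[2:])
        return (value - minimum) % step == 0
    return int(field) == value


if __name__ == "__main__":
    # Hours matched by "*/12" in "25 */12 * * *"
    print([h for h in range(24) if field_matches("*/12", h)])
    # First days of the month matched by "*/2" in "0 8 */2 * *"
    print([d for d in range(1, 32) if field_matches("*/2", d, minimum=1)][:5])
```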
Run python3 setup.py. The script checks that you have all the requirements and then helps you set up the containers.
Once finished, log in at https://api.DOMAIN.TLD/admin, or wherever your reverse proxy is listening.
*/3 * * * * /usr/bin/docker exec -t admin ./manage.py runcrons
#*/10 * * * * /usr/bin/docker exec -t admin ./manage.py get_takeaways
*/5 * * * * /usr/bin/flock -n /tmp/get_takeaways /usr/bin/docker exec admin ./manage.py get_takeaways
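The `flock -n` prefix in these entries keeps a new run from starting while a previous one still holds the lock file. A rough Python equivalent, assuming a Unix-like host (the `fcntl` module is not available on Windows), is sketched below. `try_lock` is a hypothetical name, and Gregory relies on the `flock` command shown above rather than on this code.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file object holding the lock, or None when another
    holder already has it (the case where `flock -n` gives up immediately).
    """
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.gettempdir(), "example_pipeline.lock")
    first = try_lock(lock_path)
    second = try_lock(lock_path)  # refused while `first` still holds the lock
    print("first acquired:", first is not None)
    print("second acquired:", second is not None)
    if first is not None:
        first.close()  # closing the file releases the lock
```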
Most of the logic lives in Django. The admin container provides the Django REST Framework API, manages subscriptions, and sends emails.
The following subscriptions are available:
Admin digest
Sent every 48 hours with the latest articles and their machine learning prediction. It gives the admin an Edit link where each article can be edited and tagged as relevant.
Weekly digest
Sent every Tuesday. It lists the relevant articles discovered in the last week.
Clinical Trials
This is sent every 12 hours if a new clinical trial was posted.
The title of the email footer for these emails needs to be set in the Custom Settings section of the admin backoffice.
Django also lets you add new sources from which to fetch articles; see /admin/gregory/sources/.
Emails are sent from the admin container using Mailgun.
To enable them, you will need a Mailgun account, or you can replace Mailgun with another way to send emails.
You need to configure the relevant variables for this to work:
EMAIL_USE_TLS=true
EMAIL_MAILGUN_API='YOUR API KEY'
EMAIL_DOMAIN='YOURDOMAIN'
EMAIL_MAILGUN_API_URL="https://api.eu.mailgun.net/v3/YOURDOMAIN/messages"
As an alternative, you can configure Django to use any other email server.
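For orientation, a request to the Mailgun messages endpoint can be assembled as in the sketch below. It only builds the request and does not send it. `build_mailgun_request` is a hypothetical helper written with the standard library, and Gregory's actual sending code may differ. Mailgun's HTTP API uses Basic authentication with the user name `api` and your API key as the password; the sender and recipient addresses are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_mailgun_request(api_url, api_key, sender, recipient, subject, text):
    """Return a POST request for Mailgun's messages endpoint (not sent here)."""
    body = urllib.parse.urlencode(
        {"from": sender, "to": recipient, "subject": subject, "text": text}
    ).encode("utf-8")
    token = base64.b64encode(f"api:{api_key}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(api_url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    req = build_mailgun_request(
        api_url="https://api.eu.mailgun.net/v3/YOURDOMAIN/messages",
        api_key="YOUR API KEY",
        sender="Gregory <gregory@YOURDOMAIN>",
        recipient="subscriber@example.org",
        subject="Weekly digest",
        text="Placeholder body",
    )
    print(req.get_method(), req.full_url)
    # To actually send the message: urllib.request.urlopen(req)
```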
Gregory has the concept of a 'subject'. In this case, Multiple Sclerosis is the only subject configured. A subject is a group of sources and their respective articles. Categories can also be created: a category is a group of articles whose titles match at least one keyword in the list for that category. Categories can include articles across subjects.
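The category rule can be pictured with a few lines of Python. This is only an illustration, assuming case-insensitive substring matching; `matches_category`, `articles_in_category`, and the sample titles and keywords are hypothetical, and Gregory's real matching happens inside Django and may differ in detail.

```python
def matches_category(title, keywords):
    """Return True when the title contains at least one of the keywords."""
    lowered = title.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def articles_in_category(titles, keywords):
    """Keep only the titles that match the category's keyword list."""
    return [title for title in titles if matches_category(title, keywords)]


if __name__ == "__main__":
    # Made-up titles and keywords, for illustration only.
    titles = [
        "Ocrelizumab in relapsing multiple sclerosis",
        "Physical activity and fatigue",
    ]
    print(articles_in_category(titles, ["ocrelizumab", "natalizumab"]))
```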
Lists of articles can be filtered by category or subject using the formats articles/category/<category> and articles/subject/<subject>, where <category> and <subject> stand for the name of the category or subject.
/feed/latest/articles/
/feed/articles/subject/<subject>/
/feed/articles/category/<category>/
/feed/latest/trials/
/feed/machine-learning/
/feed/twitter/
This feed includes all relevant articles, whether selected manually or by machine learning prediction. It is read by Zapier so that we can post on Twitter automatically.
Updating the machine learning models is not working right now, and there is a pull request to set up an automatic process that keeps them up to date.
It's useful to re-train the machine learning models once you have a good number of articles flagged as relevant.
cd docker-python; source .venv/bin/activate
python3 1_data_processor.py
python3 2_train_models.py
Edit the env.example file to fit your configuration and rename it to .env.
sudo docker-compose up -d
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
And the Lobsters at One Over Zero