Work through this guide to add a data source to Open Humans
If you have any questions or suggestions, or run into any issues with this demo/template, please let us know, either in GitHub issues or in our Slack channel, where our growing community hangs out.
This repository is a template for, and a working example of, an Open Humans data source. If you want to add a data source to Open Humans, we strongly recommend following the steps in this document and working from this template repo.
index.html can be edited for your specific project, but mostly this file can remain as it is, to enable user authentication
complete.html
add_data_to_open_humans in the tasks.py file
This template is a Django/Celery app that enables the end user, an Open Humans member, to add dummy data to an Open Humans project. The user arrives on the landing page (index.html) and clicks a button which takes them to Open Humans, where they can log in (and create an account if necessary). Once logged in to the Open Humans site, the user clicks another button to authorize this app to add data to their Open Humans account. They then return to this app (to complete.html), which notifies them that their data has been added and provides a link to the project summary page in Open Humans.
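The first hop in this flow, sending the user from the landing page to Open Humans, amounts to building an authorization link. The sketch below is only an illustration under stated assumptions: the endpoint path and the client_id / response_type parameters follow the standard OAuth2 authorization-code pattern and are assumed here rather than taken from this guide, and build_auth_url is a hypothetical helper name.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Assumed endpoint; check the project's own settings for the real value.
OH_AUTHORIZE_URL = (
    "https://www.openhumans.org/direct-sharing/projects/oauth2/authorize/"
)


def build_auth_url(client_id, base_url=OH_AUTHORIZE_URL):
    """Return the link a landing-page button could point at (hypothetical)."""
    # A list of pairs keeps the parameter order stable across Python versions.
    params = [("client_id", client_id), ("response_type", "code")]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_auth_url("example-client-id"))
```

After the user authorizes the app, Open Humans sends them back to the redirect URL with a code in the query string, which is the point where complete.html comes in.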
So let's get that demo working on your machine. Once it runs, you should be able to complete those steps as a user, before moving on to edit the code so that it adds your custom data source instead of a dummy file.
In your terminal, navigate to the folder in which you want to store this repo, and enter the command:
git clone git@github.com:OpenHumans/oh-data-source-template.git
This should create a new folder named oh-data-source-template, which contains all the code to create the working demo.
The Heroku CLI is a local command line tool that helps us run and eventually deploy our application to Heroku. To install:
macOS:
brew install heroku/brew/heroku
Linux:
wget -qO- https://cli-assets.heroku.com/install-ubuntu.sh | sh
Windows:
Click the link and choose an installer to download
RabbitMQ is an open source message broker used by this application.
To install RabbitMQ you can follow these instructions, or if you are using a Mac and have Homebrew installed, you can simply type brew install rabbitmq, followed by brew services start rabbitmq, to set it running in the background. On Ubuntu and other Debian-based systems, it will likely be started for you after you install the package, but you can also start it manually with: sudo rabbitmq-server start.
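If you want to confirm that the broker is actually up before going further, a quick connection test can help. This is a minimal sketch, assuming RabbitMQ is listening on its default AMQP port (5672) on your local machine; broker_is_reachable is a hypothetical helper, not part of the template.

```python
import socket


def broker_is_reachable(host="127.0.0.1", port=5672, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    This only checks that the port is open; it does not speak AMQP,
    so it cannot tell RabbitMQ apart from another service on that port.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        return False
    conn.close()
    return True


if __name__ == "__main__":
    print("broker reachable: %s" % broker_is_reachable())
```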
For the current version of this template, you will need Python 2. We are working on an updated version to run on Python 3.
Please note that if you are working on a Mac, it is strongly advised that you install a fresh version of Python. The version that ships with OSX is not suitable for development and use with third party packages. Instructions for setting up your Python environment properly in OSX can be found here.
You will need to ensure that the python alias points to a good Python 2 interpreter, and not Python 3 or the default OSX Python 2. First check that Python 2 opens when you type the command python. You can change the alias, if necessary, by adding the line alias python=python2 to your .profile, .bash_profile, or .bashrc file.
pip is a package management system used to install and manage software packages written in Python. It is available here. If you are working on a Mac and have followed the above instructions for installing a fresh Python 2, make sure to use the command pip2.
Virtual environments are useful when developing apps with lots of dependencies, since they enable us to install software locally for a specific project without it being present globally. Using a virtual environment allows us to use specific versions of each program and/or package for this project only, without affecting the versions that are used elsewhere on your machine.
We will set up the virtual environment here, and then work from within it for the remainder of this guide.
pip install pipenv
or pip2 install pipenv
pipenv --python 2.7.14
This command should output some information about how it's creating a virtual environment for us with some path information about it.
For the remainder of this tutorial, pip and python commands run through pipenv (for example, pipenv run python) will use this virtual environment.
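To check that a command really is running inside the virtual environment, you can ask the interpreter itself. The snippet below is a small sketch that works on both Python 2 and Python 3; in_virtualenv is a hypothetical helper name. Save it to a file and run that file with pipenv run python to compare against your system interpreter.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # virtualenv on Python 2 sets sys.real_prefix; on Python 3,
    # sys.base_prefix differs from sys.prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("python %d.%d, virtualenv active: %s"
          % (sys.version_info[0], sys.version_info[1], in_virtualenv()))
```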
You can install all dependencies with:
pipenv install
Note: This will install the dependencies for this project from the Pipfile and Pipfile.lock. If you have issues with installing all of the requirements at once, read the error(s), as there may be some other requirement missing locally (such as Postgres). If you still have problems, raise an issue on GitHub.
The environment file contains configurations for running the application, which both pipenv and heroku will use when running the application. It should never be committed to git and should be kept private, as it contains secrets. First copy the contents of the template environment file, env.example, into a new file saved with the filename .env (use cp env.example .env). We will go back and alter the contents after creating a project on the Open Humans site. The .env filename should already be in your .gitignore, but it is worth double-checking to make sure.
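Once you have filled in .env later in this guide, it can be handy to check that nothing was left blank. The sketch below assumes the file uses plain KEY=VALUE lines; parse_env and missing_keys are hypothetical helpers, and the key names in the usage example are the ones this guide asks you to set.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_keys(settings, required):
    """Return the required keys that are absent or empty, in the order given."""
    return [key for key in required if not settings.get(key)]


if __name__ == "__main__":
    required = ["OH_CLIENT_ID", "OH_CLIENT_SECRET", "OH_ACTIVITY_PAGE",
                "APP_BASE_URL", "SECRET_KEY"]
    with open(".env") as handle:
        print("missing: %s" % missing_keys(parse_env(handle.read()), required))
```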
Head to http://openhumans.org/direct-sharing/projects/manage to create an OAuth2 project in Open Humans. If you do not yet have an Open Humans account, you will need to create one first.
Create a new OAuth2 data request project
http://127.0.0.1:5000, this should then automatically set the redirect URL to http://127.0.0.1:5000/complete
When you have created the project, you'll be able to click on its name in the project management page to show its information. From here, get the activity page, client ID, and client secret and set them in your .env file. The ID and secret identify and authorize your app. They are used for user authorization and data management.
Keep your client secret private; it should not be committed to a repository. In Heroku it will be kept private as an environment variable, and locally it will be available from the .env file, which should not be committed to git.
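To make the relationship between the base URL, the redirect URL, and the authorization code more concrete, here is a small sketch. It assumes a standard OAuth2 authorization-code exchange; the field names are the generic OAuth2 ones, and redirect_uri and token_request_data are hypothetical helpers rather than functions from this template.

```python
def redirect_uri(app_base_url, path="/complete"):
    """Join the base URL and callback path without doubling the slash."""
    return app_base_url.rstrip("/") + path


def token_request_data(code, app_base_url):
    """Form fields for a standard OAuth2 authorization-code exchange.

    The client ID and secret are assumed to be sent separately
    (for example via HTTP Basic auth), so they are not included here.
    """
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri(app_base_url),
    }


if __name__ == "__main__":
    print(token_request_data("example-code", "http://127.0.0.1:5000"))
```

The redirect URL sent in the exchange has to match the one registered for the project, which is why the base URL needs to be consistent between Open Humans and your .env file.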
Finally we need to initialize the database and static assets to be able to get the app running. Django will use SQLite3 by default if you do not specify a DATABASE_URL in .env. For more information on databases you can check out the Django docs.
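The fallback described above can be pictured roughly as follows. This is a standard-library sketch of the idea, not necessarily how the template's settings implement it: database_config is a hypothetical function, the db.sqlite3 filename is an assumption, and any DATABASE_URL that is set is assumed to be a Postgres URL.

```python
import os

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def database_config(environ=None):
    """Return a Django-style database dict, falling back to SQLite."""
    environ = os.environ if environ is None else environ
    url = environ.get("DATABASE_URL")
    if not url:
        # No URL configured: use a local SQLite file (assumed filename).
        return {"ENGINE": "django.db.backends.sqlite3", "NAME": "db.sqlite3"}
    parsed = urlparse(url)
    # Assumes a Postgres URL such as postgres://user:pw@host:5432/dbname.
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        "USER": parsed.username,
        "PASSWORD": parsed.password,
        "HOST": parsed.hostname,
        "PORT": parsed.port,
    }


if __name__ == "__main__":
    print(database_config())
```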
In the main project directory, run the migrate command followed by collectstatic as follows:
pipenv run python manage.py migrate
pipenv run python manage.py collectstatic
Please note you can ignore the following warning message:
You have requested to collect static files at the destination location as specified in your settings:
development:~/your_project/staticfiles
This will overwrite existing files!
Are you sure you want to do this?
Now we are ready to run the app locally. Enter the command pipenv run heroku local, and don't worry if you see the following warning:
warnings.warn('Using settings.DEBUG leads to a memory leak, never')
If you are curious, the cause of this warning is outlined here.
Now head over to http://127.0.0.1:5000 in your browser to see your app running. It should look like this:
Now that you have your application built and running locally, we'll head over to Heroku, where the app will be deployed remotely.
If you have hit any problems so far, please do let us know in GitHub issues or come and chat with us over at our Slack channel.
If you don't already have a Heroku account, head to http://www.heroku.com/ to create a free account now. If you are new to app development, you may also want to go through their getting started with Heroku/Python guide before continuing with your Open Humans app.
Make sure you have installed the Heroku command line interface, then, from your terminal, you can log in and create your app with the following commands:
heroku login
you will be asked for your Heroku credentials
heroku apps:create your-app-name
If you use Heroku's free default domain, this will be set by the name you choose here, i.e. you will have
https://your-app-name.herokuapp.com
In your browser, head over to http://dashboard.heroku.com/apps and log in to see the app you just created.
Go to the resources tab, and add the following Add-ons:
CloudAMQP - a message queuing service
Heroku Postgres
Next go to the settings tab and add the environment variables as in the .env file.
OH_CLIENT_ID
OH_CLIENT_SECRET
OH_ACTIVITY_PAGE
APP_BASE_URL (e.g. https://your-app-name.herokuapp.com - no trailing slash!)
SECRET_KEY
DEBUG = true when needed
Head back over to your terminal and run the following command to initialize and update your code remotely in Heroku:
git push heroku master
You can watch logs with the command heroku logs -t.
To test out the app as a user, you can add dummy data to your project. First go to the URL for your app (https://your-app-name.herokuapp.com). You should see the following page:
Click the button which will take you to Open Humans, where you may have to log in. You should reach a page like this:
Click the button to authorize the demo app to add data to your Open Humans account. You will be directed back to your app, which will complete the data transfer. It should look like this:
You can then click to return to Open Humans to check that the demo data has been successfully added:
Before starting to edit the code in this demo to create your own project, it may be useful to understand what the existing code is doing.
index.html file), on this page is a button which transfers the user to Open Humans for authorization
complete.html file
complete function in the file views.py receives a code from Open Humans, exchanges it for a token, and uses this token to retrieve the project member ID, which is stored in the OpenHumansMember model
xfer_to_open_humans
xfer_to_open_humans takes the member ID (which was retrieved during authentication over at the Open Humans site), and runs the method add_data_to_open_humans
add_data_to_open_humans runs with three steps:
make_example_datafile method)
delete_oh_file_by_name method)
upload_file_to_oh method)
upload_file_to_oh performs the following steps:
A note on asynchronicity:
The celery.py file sets up asynchronous tasks for the app. The function xfer_to_open_humans in tasks.py is called asynchronously (from the complete function in views.py), as indicated by the presence of .delay in the function call:
xfer_to_open_humans.delay(oh_id=oh_member.oh_id)
Sometimes uploads can take a long time, so we advise using .delay so that an upload does not prevent the app from processing other events. However, if you wish to run your app synchronously, you can simply remove the .delay component from this function call.
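Whether it runs through .delay or directly, the task ends up walking through the three steps of add_data_to_open_humans listed earlier. The sketch below mimics that sequence with simplified stand-ins: the function names match the ones mentioned in this guide, but their signatures, the in-memory store dict that replaces the real Open Humans API, and the file format are all assumptions made for illustration.

```python
import json


def make_example_datafile(member_id):
    """Build a dummy file as (name, contents); the format is an assumption."""
    contents = json.dumps({"member": member_id, "example": True})
    return "example-data.json", contents


def delete_oh_file_by_name(store, member_id, name):
    """Drop any earlier copy so a re-run does not leave duplicates."""
    store.get(member_id, {}).pop(name, None)


def upload_file_to_oh(store, member_id, name, contents):
    """Record the file for this member in the stand-in store."""
    store.setdefault(member_id, {})[name] = contents


def add_data_to_open_humans(store, member_id):
    """Run the three steps in order: make, delete old copy, upload."""
    name, contents = make_example_datafile(member_id)
    delete_oh_file_by_name(store, member_id, name)
    upload_file_to_oh(store, member_id, name, contents)
    return name


if __name__ == "__main__":
    fake_store = {}
    add_data_to_open_humans(fake_store, "1234")
    print(fake_store)
```

Running the function twice for the same member leaves a single file in the store, which is the behaviour you would want from a task that users may trigger more than once.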
Now that you have a working demo and understand roughly how it works, you are ready to customise the code to create your own Open Humans data source. Use the code you have in this repository as a template for your app.
You are likely to want to start making changes in the tasks.py file, which is where much of the logic is stored. Instead of generating a dummy data file, you will want to think about how to get your own data into the app, whether it comes from a previously downloaded file that needs to be processed and/or vetted by the app, or from an external API.
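As one example of what vetting a previously downloaded file might involve, here is a minimal sketch. It assumes the data arrives as a JSON list of records and that each record needs a date and a value field; both the format and the hypothetical vet_datafile helper are illustrations only, so adapt them to whatever your data source actually produces.

```python
import json


def vet_datafile(text, required_fields=("date", "value")):
    """Return (good_records, problems) for a JSON list of records."""
    try:
        records = json.loads(text)
    except ValueError as err:
        return [], ["not valid JSON: %s" % err]
    if not isinstance(records, list):
        return [], ["expected a JSON list of records"]
    good, problems = [], []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append("record %d: not an object" % i)
            continue
        missing = [f for f in required_fields if f not in rec]
        if missing:
            problems.append("record %d: missing %s" % (i, ", ".join(missing)))
        else:
            good.append(rec)
    return good, problems


if __name__ == "__main__":
    sample = '[{"date": "2018-01-01", "value": 3}, {"date": "2018-01-02"}]'
    print(vet_datafile(sample))
```

A step like this could run before the upload, so that only records which pass the checks end up in the member's Open Humans account.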
Good luck, and please do get in touch to ask questions, give suggestions, or join in with our community chat!