This is a Singer tap that produces JSON-formatted data from the GitHub API following the Singer spec.
This tap:
Install
We recommend using a virtualenv:
> virtualenv -p python3 venv
> source venv/bin/activate
> pip install tap-github
Create a GitHub access token
Log in to your GitHub account, go to the Personal Access Tokens settings page, and generate a new token with at least the repo scope. Save this access token; you'll need it for the next step.
Create the config file
Create a JSON file containing the start date, the access token you just created, and the path to one or more repositories that you want to extract data from. Repository paths are space-delimited and relative to "base_url" (default: https://github.com/). For example, the path for this repository is singer-io/tap-github. You can also set the optional "request_timeout" parameter, the timeout for requests in seconds, which defaults to 300.
{
  "access_token": "your-access-token",
  "repository": "singer-io/tap-github singer-io/getting-started",
  "start_date": "2021-01-01T00:00:00Z",
  "request_timeout": 300,
  "base_url": "https://api.github.com"
}
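As a rough illustration of the config described above, the sketch below builds the same dictionary in Python and runs a few sanity checks on it. The helper names `build_config` and `check_config` are hypothetical and not part of tap-github. The list of required keys is an assumption drawn from the description above, and the "owner/repo" shape check is only a guess based on the example paths.

```python
import json

# Assumption: these are the keys the description above treats as mandatory.
REQUIRED_KEYS = ("access_token", "repository", "start_date")


def build_config(access_token, repositories, start_date,
                 request_timeout=300, base_url=None):
    """Return a config dict in the shape shown above (hypothetical helper)."""
    config = {
        "access_token": access_token,
        # Repository paths are joined with single spaces.
        "repository": " ".join(repositories),
        "start_date": start_date,
        "request_timeout": request_timeout,
    }
    # Only include base_url when the caller supplies one.
    if base_url is not None:
        config["base_url"] = base_url
    return config


def check_config(config):
    """Return a list of problems; an empty list means none were found."""
    problems = [f"missing key: {key}"
                for key in REQUIRED_KEYS if not config.get(key)]
    for path in config.get("repository", "").split():
        # Each path is expected to have the form "owner/repo".
        if path.count("/") != 1 or not all(path.split("/")):
            problems.append(f"unexpected repository path: {path}")
    return problems


config = build_config(
    "your-access-token",
    ["singer-io/tap-github", "singer-io/getting-started"],
    "2021-01-01T00:00:00Z",
)
print(json.dumps(config, indent=2))
print(check_config(config))
```

Writing the result to disk with `json.dump(config, open("config.json", "w"), indent=2)` would give a file usable in the steps that follow.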
Run the tap in discovery mode to get the properties.json file
tap-github --config config.json --discover > properties.json
In the properties.json file, select the streams to sync
Each stream in the properties.json file has a "schema" entry. To select a stream to sync, add "selected": true to that stream's "schema" entry. For example, to sync the pull_requests stream:
...
"tap_stream_id": "pull_requests",
"schema": {
  "selected": true,
  "properties": {
    "updated_at": {
      "format": "date-time",
      "type": [
        "null",
        "string"
      ]
    }
    ...
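Editing properties.json by hand can be error-prone, so the selection step could also be scripted. The sketch below is a minimal, hypothetical helper (`select_streams` is not part of tap-github). It assumes the discovered file has a top-level "streams" list whose entries carry "tap_stream_id" and "schema", as in the excerpt above. The in-memory catalog is a made-up stand-in, and "example_stream" is an illustrative name only.

```python
import json


def select_streams(catalog, stream_ids):
    """Set "selected": true in the schema of each named stream.

    Returns the set of requested ids that were not found in the catalog.
    """
    wanted = set(stream_ids)
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") in wanted:
            # Mark the stream as selected inside its "schema" entry.
            stream.setdefault("schema", {})["selected"] = True
            wanted.discard(stream["tap_stream_id"])
    return wanted


# Small stand-in for a discovered catalog (illustrative data only).
catalog = {"streams": [
    {"tap_stream_id": "pull_requests", "schema": {"properties": {}}},
    {"tap_stream_id": "example_stream", "schema": {"properties": {}}},
]}
missing = select_streams(catalog, ["pull_requests", "no_such_stream"])
print(json.dumps(catalog, indent=2))
print("not found:", sorted(missing))
```

For a real file, the same function could be applied by loading properties.json with `json.load`, calling `select_streams`, and writing the result back with `json.dump`.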
Run the application
tap-github can be run with:
tap-github --config config.json --properties properties.json
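Per the Singer spec, a tap writes one JSON message per line to standard output, each with a "type" of SCHEMA, RECORD, or STATE. As a quick way to inspect a run, the sketch below counts RECORD messages per stream. The function name `count_records` is hypothetical and the sample lines are invented for illustration; they are not real tap-github output.

```python
import json
from collections import Counter


def count_records(lines):
    """Count Singer RECORD messages per stream from an iterable of lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts


# Invented sample messages in the Singer message format.
sample = [
    '{"type": "SCHEMA", "stream": "pull_requests", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "pull_requests", "record": {"id": 1}}',
    '{"type": "RECORD", "stream": "pull_requests", "record": {"id": 2}}',
    '{"type": "STATE", "value": {}}',
]
counts = count_records(sample)
print(dict(counts))
```

To use this on a real run, the tap's output could be piped into a script that calls `count_records(sys.stdin)`.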
Copyright © 2018 Stitch