This repo contains several components:
Available API calls:
- https://ws.spraakbanken.gu.se/ws/metadata: List all resources in three dictionaries (corpora, lexicons, and models)
- https://ws.spraakbanken.gu.se/ws/metadata/corpora: List all corpora
- https://ws.spraakbanken.gu.se/ws/metadata/lexicons: List all lexicons
- https://ws.spraakbanken.gu.se/ws/metadata/models: List all models
- https://ws.spraakbanken.gu.se/ws/metadata/collections: List all collections
- https://ws.spraakbanken.gu.se/ws/metadata?resource=saldo: List one specific resource. Add long description from SVN (if available)
- https://ws.spraakbanken.gu.se/ws/metadata/list_ids: List all existing resource IDs
- https://ws.spraakbanken.gu.se/ws/metadata/check-id-availability?id=my-resource: Check if a given resource ID is free
- https://ws.spraakbanken.gu.se/ws/metadata/doc: Serve API documentation as YAML
- https://ws.spraakbanken.gu.se/ws/metadata/renew-cache: Flush cache and fill with fresh values

Install requirements from requirements.txt, e.g. with a virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Get an initial copy of the metadata files:
git clone https://github.com/spraakbanken/metadata.git
Add entry in supervisord config:
[program:metadata]
command=%(ENV_HOME)s/metadata-api/venv/bin/gunicorn --chdir %(ENV_HOME)s/metadata-api -b "0.0.0.0:1337" metadata_api:create_app()
Set up a cron job that periodically runs the update script, which updates the metadata from GitHub and triggers a restart if needed.
The following cron job is run on fksbwww@k2:
# Update sb-metadata from GitHub and restart if needed
50 * * * * cd /home/fksbwww/metadata-api && ./update_metadata.sh > /dev/null
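Once the service is running, the API calls listed above can be queried from a script. The following is a minimal sketch using only the Python standard library. The helper names build_url and fetch_json are hypothetical and not part of this repo, and the sketch assumes that the calls other than /doc return JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://ws.spraakbanken.gu.se/ws/metadata"


def build_url(path="", **params):
    """Build the URL for one of the API calls (hypothetical helper)."""
    url = BASE_URL
    if path:
        url += "/" + path.strip("/")
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url, timeout=10):
    """GET a URL and decode the body, assuming the response is JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Building URLs needs no network access:
print(build_url("corpora"))
print(build_url(resource="saldo"))
print(build_url("check-id-availability", id="my-resource"))
```

Calling fetch_json(build_url(resource="saldo")) would then fetch the entry for a single resource. The structure of the returned data is not described here, so inspect it before relying on specific keys.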
A collection is a "meta" metadata entry that summarizes multiple resources. Collections are supplied as YAML files. The resource IDs belonging to a collection can be supplied either as a list in the collection's YAML (under the 'resources' key), or each resource can state which collection(s) it belongs to in its own YAML (under the 'in_collections' key, which holds a list of collection IDs). The size of the collection is calculated automatically. A collection may have a resource description in its YAML.
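The two ways of declaring membership can be illustrated with a short sketch. This is not the code used in this repo. The function names are hypothetical, the YAML files are assumed to be already parsed into dictionaries, and the 'size' layout and the summing rule are assumptions, since only the 'resources' and 'in_collections' keys are described above.

```python
def collection_members(collection_id, collection, resources):
    """Return the sorted IDs of all resources that belong to a collection.

    collection: dict parsed from the collection's YAML
    resources:  dict mapping resource ID -> dict parsed from that resource's YAML
    """
    # Members listed in the collection itself ('resources' key)
    members = set(collection.get("resources") or [])
    # Members that point to the collection ('in_collections' key)
    for res_id, res in resources.items():
        if collection_id in (res.get("in_collections") or []):
            members.add(res_id)
    return sorted(members)


def collection_size(member_ids, resources):
    """Sum the members' sizes key by key (assumed 'size' layout)."""
    total = {}
    for res_id in member_ids:
        size = resources.get(res_id, {}).get("size") or {}
        for key, value in size.items():
            total[key] = total.get(key, 0) + value
    return total


# Made-up example data
resources = {
    "corpus-a": {"size": {"tokens": 100}},
    "corpus-b": {"size": {"tokens": 250}, "in_collections": ["my-collection"]},
    "corpus-c": {"size": {"tokens": 40}},
}
collection = {"resources": ["corpus-a"]}

members = collection_members("my-collection", collection, resources)
print(members)                              # ['corpus-a', 'corpus-b']
print(collection_size(members, resources))  # {'tokens': 350}
```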
Resources with the attribute "unlisted": true will not be listed in the data list on the web page, but they can be accessed directly via their URL. This is used as a quick and dirty versioning system.
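How a consumer of the metadata might honour this attribute can be sketched as follows. The function name and the example IDs are hypothetical, and the resources are assumed to be available as a dictionary keyed by resource ID.

```python
def listed_resources(resources):
    """Return only the resources that should appear in the data list."""
    return {
        res_id: res
        for res_id, res in resources.items()
        if not res.get("unlisted", False)
    }


# Made-up example data: an old version kept reachable but hidden
resources = {
    "my-resource": {"name": "My resource"},
    "my-resource-v1": {"name": "My resource (old version)", "unlisted": True},
}

print(sorted(listed_resources(resources)))  # ['my-resource']
# Direct access by ID still works for unlisted resources:
print(resources["my-resource-v1"]["name"])
```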
The successors attribute can be used for resources that have been superseded by one or more other resources (e.g. newer versions). This attribute holds a list of resource IDs.
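As an illustration, a client could follow the successors lists to find the most recent replacements for an old resource. The sketch below is an assumption about how the attribute might be used, not something this repo is known to do. The function name and the example IDs are hypothetical.

```python
def latest_successors(resource_id, resources):
    """Follow 'successors' until reaching resources that have none.

    Returns an empty list if the resource has not been superseded.
    """
    result = []
    seen = set()  # guards against accidental cycles
    stack = [resource_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        successors = resources.get(current, {}).get("successors") or []
        if successors:
            # Reverse so that successors are visited in their listed order
            stack.extend(reversed(successors))
        elif current != resource_id:
            result.append(current)
    return result


# Made-up example data
resources = {
    "res-v1": {"successors": ["res-v2"]},
    "res-v2": {"successors": ["res-v3a", "res-v3b"]},
    "res-v3a": {},
    "res-v3b": {},
}

print(latest_successors("res-v1", resources))   # ['res-v3a', 'res-v3b']
print(latest_successors("res-v3a", resources))  # []
```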
For documentation see the code comments and the /docs directory.
The DataCite login credentials are stored in a .netrc file located in /home/fksbwww on the server.