Open-EO / openeo-hub

Source code for openEO Hub, a centralized platform to explore openEO back-end providers.
https://hub.openeo.org
Apache License 2.0

openEO Hub

This repository contains the source code for openEO Hub, a centralized platform to explore openEO back-end providers.

It is currently in a rather early stage of development.

Goals

openEO Hub tries to implement some ambitious ideas. It aims to be a platform that allows users to:

Public API endpoints

The Hub provides its data via a RESTful API under https://hub.openeo.org/api. The following endpoints are intended to be used by the public:

Getting started

This app is deployed at https://hub.openeo.org/.

If you want to set it up yourself, follow these steps:

Requirements

You need Node.js (version 12 or later, as required by the mongodb driver) and a MongoDB server (version 3.6 or later, which supports field names that contain . or $).
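The Node.js requirement can be checked from a script before installing anything. The following is a minimal sketch and not part of this repository; the helper name meetsMinimumMajor is made up for illustration.

```javascript
// Hypothetical helper, not part of openeo-hub: returns true if a Node.js
// version string such as "12.22.1" or "v14.17.0" has at least the given major version.
function meetsMinimumMajor(version, minMajor) {
  const major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// process.versions.node holds the version of the running Node.js without a leading "v".
if (!meetsMinimumMajor(process.versions.node, 12)) {
  console.error('Node.js 12 or later is required, found ' + process.versions.node);
}
```

The MongoDB version is not covered by this sketch; it would have to be queried from the server itself.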

Database

  1. Install MongoDB, in particular the mongod daemon (tested with v4.0.4)
  2. Start it (with write access to the dbpath) - e.g. sudo mongod --dbpath /var/lib/mongodb
  3. It should output "waiting for connections on port 27017"

Should you ever want to hard-reset the database (i.e. drop all collections openeo-hub created), use the drop script by calling node drop.js --yesimsure or npm run drop -- --yesimsure. By default, the script leaves collections with user-generated data (e.g. the submitted process graphs) intact. If you want to drop those too, add the --everything option.
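The interplay of the two options can be pictured roughly as follows. This is an illustrative sketch only, assuming that nothing is dropped without --yesimsure; the real drop.js may be structured differently, and the function name and collection names are placeholders.

```javascript
// Illustrative sketch only; the real drop.js may work differently.
// Decides which collections would be dropped for a given set of CLI options.
function selectCollectionsToDrop(argv, crawledCollections, userCollections) {
  // Assumption: without the confirmation flag, nothing is dropped.
  if (!argv.includes('--yesimsure')) {
    return [];
  }
  // By default only crawled data is dropped; --everything adds user-generated data.
  return argv.includes('--everything')
    ? crawledCollections.concat(userCollections)
    : crawledCollections.slice();
}

// process.argv.slice(2) contains the options passed after "node <script>".
const toDrop = selectCollectionsToDrop(
  process.argv.slice(2),
  ['crawled_a', 'crawled_b'], // placeholder collection names
  ['user_generated']          // placeholder collection name
);
console.log('Would drop:', toDrop);
```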

Frontend and API backend

  1. Clone this repo, then cd /path/to/openeo-hub/
  2. Run npm install and wait for it to finish
  3. Edit config.json:
    • Specify the URL and name of your MongoDB server and database (required)
    • Specify the backends to crawl (required). This happens via an object with display names as the keys and URLs as the values. The display name is only shown if a backend does not supply one itself. The URLs MUST point to an openEO service that supports well-known discovery, but the specified URL itself MUST NOT contain the trailing /.well-known/openeo. The URLs may or may not have a trailing slash.
    • Optional: Change presets for thresholds that control how the crawler handles existing data that is not reachable on re-crawl
  4. npm run crawl -> wait until finished with output "DONE!" (see below if something doesn't look right or any line starts with "An error...")
  5. npm start
  6. Go to http://localhost:9000/
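The URL rules for the backends setting in step 3 can be sketched as a small check. This is a hedged illustration, not code from this repository: the helper name wellKnownUrl and the example backends are made up, and the crawler's actual handling may differ.

```javascript
// Hypothetical validation helper; not taken from the openeo-hub code base.
const WELL_KNOWN_PATH = '/.well-known/openeo';

// Derives the well-known discovery URL from a configured backend URL.
function wellKnownUrl(backendUrl) {
  // Accept URLs with or without a trailing slash.
  const base = backendUrl.replace(/\/+$/, '');
  // The configured URL must not already end with the discovery path.
  if (base.endsWith(WELL_KNOWN_PATH)) {
    throw new Error('Backend URL must not contain ' + WELL_KNOWN_PATH + ': ' + backendUrl);
  }
  return base + WELL_KNOWN_PATH;
}

// Example object in the shape described above: display names as keys, URLs as values.
// Both entries are made-up placeholders.
const backends = {
  'Example Backend A': 'https://openeo.example.com',
  'Example Backend B': 'https://openeo.example.org/'
};

for (const [name, url] of Object.entries(backends)) {
  console.log(name + ' -> ' + wellKnownUrl(url));
}
```

Running a check like this before npm run crawl would catch a URL that was copied together with its /.well-known/openeo suffix.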

Troubleshooting

If errors occur during crawling, they are probably caused by one of the crawled backends (a) returning JSON that is not compliant with the openEO API specification, or (b) malfunctioning under the load of many requests in quick succession. In the first case (a), the --verbose option may help to locate the error (be sure to pass the option to the script and not to npm, i.e. call node crawl.js --verbose or npm run crawl -- --verbose).

Scheduling re-crawling

On Linux systems, you can use the cron daemon to schedule recurring crawling. For example, adding the following line to /etc/crontab executes the crawl script every night at 3:00 am, as the user johndoe: 0 3 * * * johndoe node /path/to/openeo-hub/crawl.js
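To make the fields of that crontab line explicit, here is a small illustrative parser. It assumes the /etc/crontab format, which has a user field between the five schedule fields and the command; the function name is made up for this example.

```javascript
// Illustrative only: splits a system crontab line (/etc/crontab format)
// into its five schedule fields, the user, and the command.
function parseSystemCrontabLine(line) {
  const fields = line.trim().split(/\s+/);
  if (fields.length < 7) {
    throw new Error('Expected 5 schedule fields, a user, and a command');
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek, user] = fields;
  return {
    minute, hour, dayOfMonth, month, dayOfWeek, user,
    command: fields.slice(6).join(' ')
  };
}

const entry = parseSystemCrontabLine('0 3 * * * johndoe node /path/to/openeo-hub/crawl.js');
console.log(entry);
```

Minute 0 and hour 3 with wildcards in the remaining schedule fields mean "every day at 03:00".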

Development

There are several start scripts for different dev scenarios:

The Hub depends on the openeo-vue-components repo - if you're simultaneously working on that too and want to see how your changes there work together with the Hub, it's smart to link it:

  1. cd /path/to/openeo-vue-components
  2. sudo npm link
  3. cd /path/to/openeo-hub/
  4. npm link @openeo/vue-components

This makes all references to @openeo/vue-components in imports etc. point to your current local state of that repo.

Note these caveats: