The Open Data Hub Mobility Data Collectors were historically also called Big Data Platform data collectors; since data providers were also contained in this repository, it was named common.
This repository contains the source code of all data collectors: Java workers that connect to a remote data pool, such as an API, an MQTT broker, or an FTP server, download data, aggregate and enrich it, and finally send it to the Big Data Platform writer, which stores it in a PostgreSQL database.
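The fetch, enrich, and push flow described above can be sketched roughly as follows. All class and method names here are illustrative stand-ins, not the actual bdp-core or dc-interface API:

```java
import java.util.List;

// Illustrative sketch of the collector flow; names are hypothetical,
// not the actual Open Data Hub writer client API.
public class CollectorSketch {

    // A single raw measurement as it might arrive from a remote data pool
    record RawRecord(String stationId, long timestamp, double value) {}

    // Stand-in for the Open Data Hub writer client
    interface WriterClient {
        void pushRecords(List<RawRecord> records);
    }

    static List<RawRecord> fetchFromRemotePool() {
        // A real collector would call an API, MQTT broker, FTP server, or the like
        return List.of(new RawRecord("station-1", System.currentTimeMillis(), 42.0));
    }

    static List<RawRecord> enrich(List<RawRecord> raw) {
        // Aggregation / enrichment step; a no-op here for illustration
        return raw;
    }

    public static void main(String[] args) {
        WriterClient writer = records ->
                System.out.println("pushed " + records.size() + " record(s)");
        List<RawRecord> data = enrich(fetchFromRemotePool());
        writer.pushRecords(data);
    }
}
```

In the real collectors the writer client is provided by the platform libraries and handles authentication and station/measurement mapping; this sketch only shows the overall shape of a sync run.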
We use Keycloak for authentication against the Open Data Hub writer API.
Table of contents
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. These are just general guidelines; for specific details, refer to the README.md file in each folder.
To build the data collector project, the following prerequisites must be met:
If you want to run the application using Docker, the environment is already set up with all dependencies for you. You only have to install Docker and Docker Compose and follow the instructions in the dedicated section.
Hint: To make sure you have the correct Java version and a build environment matching our infrastructure, use the provided Docker configuration.
Get a copy of the repository:
git clone https://github.com/noi-techpark/bdp-commons.git
Change directory:
cd bdp-commons/data-collectors/[your-collector]
Build the project:
mvn clean package
The unit tests can be executed with the following command:
mvn clean test
Copy the .env.example file to .env and configure it. Then start the containers:
docker-compose up -d
To follow the logs:
docker-compose logs -f
Please refer to the README.md inside that folder for further details, and report any issues to help@opendatahub.com.
Copy this file to .vscode/launch.json:
{
"version": "0.2.0",
"configurations": [
{
"type": "java",
"name": "Attach",
"request": "attach",
"hostName": "0.0.0.0",
"port": "9000",
"justMyCode": false
}
]
}
Run docker-compose up -d inside the data-collector folder of your choice, and then launch Attach from VSCode. You are now ready to set breakpoints and debug.
Change directory into the data collector you want. You can set the parameters directly as environment variables (see .env.example) and start it, as follows:
1) Newer data collectors are Spring Boot applications
mvn spring-boot:run
...or, if you want to use your personalized Spring profile:
cd data-collectors/[your-collector]
cp src/main/resources/application.properties src/main/resources/application-local.properties
# Now open src/main/resources/application-local.properties and modify values as you like
mvn -Dspring.profiles.active=local spring-boot:run
2) Older data collectors are Spring applications with an additional tomcat maven plugin:
mvn tomcat:run \
-DPARAM1=... \
-DPARAM2=... \
-DPARAM3=...
...or, set them inside the relevant .properties
files directly (see the
corresponding README.md
for details), and run:
mvn tomcat:run
You do not need special credentials for local development. Use the following
Keycloak OAuth parameters inside application.properties
to get started
immediately (some data collectors have them already as defaults):
authorizationUri=https://auth.opendatahub.testingmachine.eu/auth
tokenUri=https://auth.opendatahub.testingmachine.eu/auth/realms/noi/protocol/openid-connect/token
BASE_URI=http://localhost:8999/json
clientId=odh-mobility-datacollector-development
clientName=odh-mobility-datacollector-development
clientSecret=7bd46f8f-c296-416d-a13d-dc81e68d0830
scope=openid
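Under the hood, these parameters drive a standard OAuth2 client-credentials request against the tokenUri. As a sketch, the snippet below only builds such a request with the JDK's java.net.http classes; it never actually contacts the server, and the helper method name is illustrative:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of the OAuth2 client-credentials token request built from the
// parameters above; the request is only constructed here, never sent.
public class TokenRequestSketch {

    // Illustrative helper: assembles the form-encoded token request body
    static String formBody(String clientId, String clientSecret, String scope) {
        return "grant_type=client_credentials"
                + "&client_id=" + clientId
                + "&client_secret=" + clientSecret
                + "&scope=" + scope;
    }

    public static void main(String[] args) {
        String body = formBody(
                "odh-mobility-datacollector-development",
                "7bd46f8f-c296-416d-a13d-dc81e68d0830",
                "openid");

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://auth.opendatahub.testingmachine.eu/auth/realms/noi/protocol/openid-connect/token"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        System.out.println(request.method() + " " + request.uri());
    }
}
```

In practice the Spring OAuth2 client configured via application.properties performs this exchange for you; the sketch just shows what the configured values are used for.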
Or, find the corresponding variable names inside the specific .env
files of
each data collector, if you develop with docker. Unfortunately, these were not
standardized in the past.
If you want to test it on our infrastructure directly, please read about Credentials in our Contributor Guidelines.
1) Copy data-collectors/helloworld/ci-helloworld.yml to .github/workflows/ci-your-new-datacollector.yml
2) Replace helloworld with your-new-datacollector, so that the workflow points to data-collectors/your-new-datacollector
3) Update the infrastructure/ansible/hosts file
4) Replace secrets such as ${{ secrets.HELLOWORLD_SECRET_1 }} with the ones your data collector needs
To update a dependency in all data collectors, the quickversionbump scripts can be used.
Note: read the comments in each script for further instructions.
For support, please contact help@opendatahub.com.
If you want to write a new Data Collector:
1) Read and follow our [Getting Started] guidelines
2) Copy/paste the helloworld example into a new folder under data-collectors, named after your data collector
3) Find TODO
comments and follow their instructions
4) See and alter code inside SyncScheduler.java
5) Start the writer API locally and test everything.
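The SyncScheduler mentioned in step 4 is, in the real collectors, a periodically triggered Spring component. As a plain-JDK analogue of that pattern, the sketch below uses a ScheduledExecutorService; the class name, method name, and 10-minute interval are illustrative, not taken from any actual collector:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Plain-JDK analogue of a SyncScheduler; real collectors use a Spring
// scheduled method instead. Names and the interval are illustrative.
public class SyncSchedulerSketch {

    static String syncJob() {
        // 1) fetch data from the remote pool
        // 2) map it to Open Data Hub stations / measurements
        // 3) push it to the writer API
        return "sync run completed";
    }

    public static void main(String[] args) {
        System.out.println(syncJob()); // one immediate run

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(syncJob()), 10, 10, TimeUnit.MINUTES);

        // Sketch only: a real worker would keep running indefinitely
        scheduler.shutdownNow();
    }
}
```

The real SyncScheduler.java in each collector contains the data-source-specific fetch and mapping logic; this sketch only illustrates the periodic-sync shape around it.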
[Getting Started]: https://github.com/noi-techpark/odh-docs/wiki/Contributor-Guidelines:-Getting-started
More documentation can be found at https://docs.opendatahub.com.
The code in this project is licensed under the GNU Affero General Public License, Version 3.
See the LICENSE file for more information.
This project is REUSE compliant, more information about the usage of REUSE in NOI Techpark repositories can be found here.
Since the CI for this project checks for REUSE compliance, you might find it useful to use a pre-commit hook that checks REUSE compliance locally. The pre-commit-config file in the repository root is already configured to check for REUSE compliance with the help of the pre-commit tool.
Install the tool by running:
pip install pre-commit
Then install the pre-commit hook via the config file by running:
pre-commit install