Tombolo Digital Connector

The Tombolo Digital Connector is an open source tool that enables users to seamlessly combine datasets from different sources in an efficient, transparent and reproducible way.

There are three particularly important parts to the Tombolo Digital Connector:

For further information see the documentation.

Table of Contents:

The Challenge

Contributing

Looking to get involved? Have a look at the Open Source Community milestone, where we have selected low-hanging fruit so that you can easily get involved and contribute. Read our Guide to contribution for details.

Requirements

To get started you will need to install the requirements to run the Digital Connector.

Note: you’ll need to have administrator rights on your machine to install these - make sure that you do before you proceed.

Install the following via the link through to their installation page:

After the successful installation of the requirements, you can use the Digital Connector by following the instructions in the quick start section or by going through the intro tutorial in the documentation.

Installation Guides

Quick start

This tutorial will guide you to a quick start on macOS.

A note about the Terminal

The Terminal application can be found in the Applications -> Utilities folder or quickly accessed through Spotlight. It is pre-installed in macOS so there is no need to install it.

You will need this application to run some of the commands of this tutorial. When you enter a command and press return/enter, the terminal will execute it and complete the task.

Make sure to press return after typing a command before you enter the next one.

Let's start

If successful, the final output will look like the following.

$ gradle test
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:compileTestJava UP-TO-DATE
:processTestResources UP-TO-DATE
:testClasses UP-TO-DATE
> Building 85% > :test > 50 tests completed
:test

BUILD SUCCESSFUL

Total time: 4 mins 50.919 secs

If the tests fail, check that the PostgreSQL server is running and that the requirements are properly installed by going back through the previous steps.

Below are a couple of examples of what might have gone wrong if the tests fail.

uk.org.tombolo.core.AttributeTest > testUniqueLabel FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

uk.org.tombolo.core.AttributeTest > testWriteJSON FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

uk.org.tombolo.core.DatasourceTest > testWriteJSON FAILED
    java.util.ServiceConfigurationError
        Caused by: org.hibernate.service.spi.ServiceException
            Caused by: org.hibernate.exception.JDBCConnectionException
                Caused by: org.postgresql.util.PSQLException
                    Caused by: java.net.ConnectException

The error log above appears if the PostgreSQL server is not running. To fix it, run the following command.

pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start

The same errors can also appear if you did not set up the tombolo_test database.
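To tell the two causes apart (server not running versus missing tombolo_test database), a small check along the following lines may help. This is a minimal sketch, not part of the Digital Connector: the function name `check_tombolo_test_db` is made up for illustration, and it assumes that the standard PostgreSQL client tools `pg_isready` and `psql` are on your PATH and that the server listens on localhost:5432, as in the test database URI used elsewhere in this README.

```shell
# Hypothetical helper (not shipped with the Digital Connector).
# Reports whether a PostgreSQL server is reachable and whether the
# tombolo_test database can be seen in the list of databases.
check_tombolo_test_db() {
    chk_host="${1:-localhost}"   # assumed default host
    chk_port="${2:-5432}"        # assumed default port

    if ! command -v pg_isready >/dev/null 2>&1; then
        echo "pg_isready not found: check that PostgreSQL is installed and on your PATH"
        return 2
    fi

    # pg_isready exits with 0 only when the server accepts connections.
    if ! pg_isready -q -h "$chk_host" -p "$chk_port"; then
        echo "no server accepting connections on $chk_host:$chk_port: start it with pg_ctl"
        return 1
    fi

    # psql -lqt prints one database per line, name in the first column.
    if psql -h "$chk_host" -p "$chk_port" -lqt 2>/dev/null \
        | cut -d '|' -f 1 | grep -qw tombolo_test; then
        echo "server is up and the tombolo_test database exists"
    else
        echo "server is up but the tombolo_test database was not found or could not be listed"
        return 1
    fi
}
```

After pasting the function into your terminal, run `check_tombolo_test_db` and then re-run `gradle test` once it reports that the database exists.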

If you see the following error instead, it means that the settings files were not renamed successfully.

FAILURE: Build failed with an exception.

* Where:
Build file '/TomboloDigitalConnector/build.gradle' line: 159

* What went wrong:
Execution failed for task ':test'.
> Test environment not configured. See the README.

If you see other errors, try to go back and follow the steps again.

Run the Digital Connector

Now you are all set to run a task on the Digital Connector.

The next step is to run an example to show how the Digital Connector combines different datasets. We’re using an example that shows the relationship between air pollution (represented in this example by NO2 levels) and car and bicycle traffic in every borough in London. You can read more about this example here.

When you’ve run this example, you can expect a map that looks like this:

Final Output

To get started:

We need your feedback!
If you have any issues with setting up the tool, or running the tutorial, or if you have some advice about how we can do this better, please contact us by creating an issue. Our goal is for someone to get back to you within 24 hours.

See also:

Run tests

gradle test

If you use the IntelliJ JUnit test runner, you will need to add the following to your VM Options in your JUnit configuration (Run -> Edit Configurations -> All under JUnit, and Defaults -> JUnit):

-enableassertions
-disableassertions:org.geotools...
-Denvironment=test
-DdatabaseURI=jdbc:postgresql://localhost:5432/tombolo_test
-DdatabaseUsername=tombolo_test
-DdatabasePassword=tombolo_test

Local deploy

To deploy to your local Maven installation (~/.m2 by default):

gradle install

Run Tasks

Run export

We use the Gradle task runExport to run exports. The parameters are as follows:

gradle runExport -Precipe='path/to/spec/file.json' -Poutput='output_file.json' -Pforce='com.className' -Pclear=true

For example, this calculates the proportion of cycle traffic received at a traffic counter relative to the total traffic in a given borough and outputs the results to the file reaggregate-traffic-count-to-la_output.json:

gradle runExport -Precipe='src/main/resources/executions/examples/reaggregate-traffic-count-to-la.json' -Poutput='reaggregate-traffic-count-to-la_output.json'
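If you run exports often, the parameters above can be wrapped in a small shell function. The following is a sketch under stated assumptions: the function name `run_export` and the `DRY_RUN` variable are hypothetical and not part of the project, and the function treats `-Pforce` and `-Pclear` as optional, which is an assumption based on the example above omitting them.

```shell
# Hypothetical wrapper around the runExport Gradle task.
# Usage: run_export RECIPE OUTPUT [FORCE_CLASS] [CLEAR]
# Set DRY_RUN=1 to print the command instead of running it.
run_export() {
    exp_recipe="${1:-}"
    exp_output="${2:-}"
    exp_force="${3:-}"
    exp_clear="${4:-}"

    if [ -z "$exp_recipe" ] || [ -z "$exp_output" ]; then
        echo "usage: run_export RECIPE OUTPUT [FORCE_CLASS] [CLEAR]" >&2
        return 64
    fi

    # Build the argument list one parameter at a time.
    set -- gradle runExport "-Precipe=$exp_recipe" "-Poutput=$exp_output"
    if [ -n "$exp_force" ]; then
        set -- "$@" "-Pforce=$exp_force"
    fi
    if [ -n "$exp_clear" ]; then
        set -- "$@" "-Pclear=$exp_clear"
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"   # show the command only
    else
        "$@"        # run Gradle with the assembled arguments
    fi
}
```

For instance, `DRY_RUN=1 run_export recipe.json out.json` prints the Gradle command without executing it, which is a cheap way to check the paths before starting a long export.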

Export data catalogue

We use the Gradle task exportCatalogue to export a JSON file that details the capabilities of the connector and lets you explore the data catalogue.

gradle exportCatalogue -Poutput=catalogue.json

Importer Info

We use the Gradle task info to get details about the available importers. Run without parameters:

gradle info

The command above lists all the importers available in the Digital Connector.

gradle info -Pi='uk.org.tombolo.importer.dft.TrafficCountImporter'

The command above lists all the details of the importer: Provider, SubjectTypes, Attributes, Datasourceids and Dataurl.

gradle info -Pp -Pi='uk.org.tombolo.importer.dft.TrafficCountImporter'

The command above returns only the Datasourceids, Dataurl and Provider. The other options, -Pa and -Ps, return the Attributes and SubjectTypes respectively.

Note: Datasourceids and Dataurl will always be provided irrespective of the option given.
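The option letters are easy to mix up, so a thin wrapper that maps readable names onto them can be convenient. This is only an illustrative sketch: the function name `importer_info`, the section names (`all`, `provider`, `attributes`, `subjects`) and the `DRY_RUN` variable are hypothetical, and the only Gradle options used are the ones described above.

```shell
# Hypothetical wrapper around the info Gradle task.
# Usage: importer_info IMPORTER_CLASS [all|provider|attributes|subjects]
# Set DRY_RUN=1 to print the command instead of running it.
importer_info() {
    info_class="${1:-}"
    info_section="${2:-all}"

    if [ -z "$info_class" ]; then
        echo "usage: importer_info IMPORTER_CLASS [all|provider|attributes|subjects]" >&2
        return 64
    fi

    # Map the readable section name onto the option described in this README.
    case "$info_section" in
        all)        info_flag="" ;;
        provider)   info_flag="-Pp" ;;
        attributes) info_flag="-Pa" ;;
        subjects)   info_flag="-Ps" ;;
        *)
            echo "unknown section: $info_section" >&2
            return 64
            ;;
    esac

    set -- gradle info
    if [ -n "$info_flag" ]; then
        set -- "$@" "$info_flag"
    fi
    set -- "$@" "-Pi=$info_class"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"   # show the command only
    else
        "$@"        # run Gradle with the assembled arguments
    fi
}
```

For example, `importer_info uk.org.tombolo.importer.dft.TrafficCountImporter attributes` runs the same task as the commands above with -Pa selected.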

Start/Stop server

If you need to start or stop the PostgreSQL server (on macOS), use the following commands.

# to start
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start

# to stop
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log stop
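Starting a server that is already running produces an error, so it can be handy to check the status first. The sketch below uses `pg_ctl status`, which exits with 0 when a server is running in the given data directory. The function name `ensure_postgres_running` and the `PG_CTL` and `PGDATA_DIR` override variables are hypothetical; the data directory default is the one used in the commands above.

```shell
# Hypothetical helper: start PostgreSQL only if it is not already running.
# PG_CTL and PGDATA_DIR can be overridden; the defaults match this README.
ensure_postgres_running() {
    srv_pg_ctl="${PG_CTL:-pg_ctl}"
    srv_data="${PGDATA_DIR:-/usr/local/var/postgres}"

    # pg_ctl status returns 0 when a server is running in the data directory.
    if "$srv_pg_ctl" -D "$srv_data" status >/dev/null 2>&1; then
        echo "PostgreSQL is already running"
    else
        "$srv_pg_ctl" -D "$srv_data" -l "$srv_data/server.log" start
    fi
}
```

Running `ensure_postgres_running` before `gradle test` avoids the connection errors shown in the quick start when the server has simply not been started yet.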

Implementations

License

MIT

When using the Tombolo or other GitHub logos and artwork, be sure to follow the GitHub logo guidelines.