italia / public-opendata-sources

A (complete) list of Italian public open data sources.
https://italia.github.io/public-opendata-sources/

Italian Public Open Data Sources


This repository aims to collect and share an up-to-date list of Italian public open data sources, as complete as possible, in both human- and machine-readable formats.

It is also the official repository of harvesting sources for CKAN-IT when it is configured as an open data harvester, e.g. within the Piattaforma Digitale Nazionale Dati (PDND), previously known as DAF. CKAN-IT packages everything you need to run CKAN, plus a set of extensions supporting Italian open data, as a set of Docker images. If you are interested in getting an open data catalogue up and running in minutes, see italia/ckan-it.

Organizations and harvesting sources

There are two kinds of entity: organizations and harvesting sources. Each one is described by a JSON file compliant with the schemas provided. The orgs/ folder contains all organizations; the sources/ folder contains all harvesting sources.

Before importing, or just after exporting, you should check compliance with the schemas in the schemas/ folder using the two provided scripts. You must have Python 3 and the jsonschema module installed on your system.

If you want to run them in a virtual environment, a Pipfile is provided for use with pipenv. If you have pipenv, just run pipenv shell and then pipenv install before launching the validation scripts.
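As an illustration of what such a schema check involves, here is a minimal sketch using the jsonschema module. It is not the repository's validation script: the function name `validate_file` and any paths passed to it are hypothetical, and the validator class is picked from the schema itself because the draft used by the provided schemas is not stated here.

```python
import json
from pathlib import Path


def validate_file(entity_path, schema_path):
    """Return a list of readable validation errors (empty if the file is valid)."""
    # jsonschema is the third-party module mentioned above; it is imported here
    # so that the rest of the module still loads when it is not installed.
    from jsonschema.validators import validator_for

    schema = json.loads(Path(schema_path).read_text(encoding="utf-8"))
    entity = json.loads(Path(entity_path).read_text(encoding="utf-8"))

    # Choose the validator class that matches the schema's "$schema" keyword.
    validator = validator_for(schema)(schema)

    errors = []
    for err in validator.iter_errors(entity):
        # absolute_path is the location of the offending value in the entity.
        location = "/".join(str(part) for part in err.absolute_path) or "<root>"
        errors.append(f"{entity_path}: {location}: {err.message}")
    return errors
```

Calling `validate_file` with one file from orgs/ or sources/ and the matching schema from schemas/ returns an empty list when the file is compliant.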

You can combine all entities from the orgs/ and sources/ folders into single JSON files using the export_all.py script, which creates two files in the dist/ folder.

To validate them you can use the provided validate_all.sh script.
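The combining step can be pictured with the following sketch. It is an assumption about the general approach, not the actual export_all.py: the function names and the output path used in the usage note are hypothetical, and the real script may structure its output differently.

```python
import json
from pathlib import Path


def combine_entities(folder):
    """Load every *.json file in `folder` into one list, ordered by file name."""
    entities = []
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            entities.append(json.load(fh))
    return entities


def write_combined(folder, out_file):
    """Write all entities found in `folder` to a single JSON file.

    Returns the number of entities written.
    """
    out_path = Path(out_file)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    entities = combine_entities(folder)
    with out_path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented Italian names readable in the output.
        json.dump(entities, fh, ensure_ascii=False, indent=2)
    return len(entities)
```

For example, `write_combined("orgs", "dist/orgs-example.json")` would gather every organization file into one array; the output file name here is a placeholder only.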

Naming convention

Organizations:

Harvesting sources:

Harvester types

The following different types of harvesters are currently supported.

How to import orgs and sources in CKAN-IT

Install, set up, and run CKAN from the official repository.

If the official Docker images suit your needs, simply run them with the environment variable CKAN_HARVEST="true" set. Read more here for details.

Otherwise, you can run the import_all.sh script manually against a running instance of CKAN-IT.

  1. Run bash import_all.sh APIKEY HOST, where APIKEY is the API key of your admin user (read more here for details) and HOST is the CKAN host (e.g. localhost:5000).
  2. Browse to http://localhost:5000/organization to check all imported organizations.
  3. Browse to http://localhost:5000/harvest to check all imported sources.
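To make the roles of APIKEY and HOST in the steps above concrete, here is a rough sketch of how one organization could be sent to CKAN's action API (`organization_create`, with the API key in the `Authorization` header). Whether import_all.sh works this way is an assumption; the helper names, the plain `http://` scheme, and the sample payload are hypothetical.

```python
import json
import urllib.request


def build_action_request(host, api_key, action, payload):
    """Build (but do not send) a POST request to a CKAN action API endpoint."""
    url = f"http://{host}/api/3/action/{action}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # API key of the admin user
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the "result" part of CKAN's JSON reply."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    if not body.get("success"):
        raise RuntimeError(f"CKAN action failed: {body.get('error')}")
    return body["result"]
```

A call such as `send(build_action_request("localhost:5000", APIKEY, "organization_create", org))`, with `org` loaded from one file in orgs/, would create that organization on a running instance.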

Now follow these steps to run the CKAN-IT harvesting process.

How to export your orgs and sources

If you are running a CKAN-IT instance with many harvesting sources defined (e.g. through the web interface), you can export them all using the export_orgs.py and export_sources.py scripts. You must have Python 3 installed, or use pipenv with the provided Pipfile.
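For orientation, an export of this kind could be built on CKAN's action API, for instance `organization_list` with `all_fields=true`. The sketch below only shows the URL construction, the handling of CKAN's response envelope, and writing one file per entity. It is not the code of export_orgs.py or export_sources.py; all function names are hypothetical, and naming each file after the entity's `name` field is an assumption.

```python
import json
from pathlib import Path
from urllib.parse import urlencode


def action_url(host, action, **params):
    """Build the URL of a CKAN action API call, with optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"http://{host}/api/3/action/{action}{query}"


def unwrap_result(raw_body):
    """Parse a CKAN action API reply and return its "result" payload."""
    body = json.loads(raw_body)
    if not body.get("success"):
        raise RuntimeError(f"CKAN action failed: {body.get('error')}")
    return body["result"]


def save_each(entities, folder):
    """Write every entity to <folder>/<name>.json and return the written paths."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for entity in entities:
        path = out_dir / f"{entity['name']}.json"  # file naming is an assumption
        path.write_text(
            json.dumps(entity, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        written.append(path)
    return written
```

Fetching `action_url(HOST, "organization_list", all_fields="true")`, passing the response body to `unwrap_result`, and handing the result to `save_each` would produce one JSON file per organization, ready to be checked against the schemas.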

How to contribute

Contributions are welcome. Feel free to open issues and submit a pull request at any time!