:warning: This tool is still under development and is NOT yet feature-complete. Expect breaking changes and bugs. Please report any issues.
Matricula Online is a website that hosts parish registers from various regions across Europe. This CLI tool allows you to fetch data from it and save the data to a file.
Our GitHub Workflow automatically scrapes a list of all parishes once a week and pushes it to cache/parishes. Download parishes.csv ⚡️
Note that this tool does not format or clean the data in any way; the data is saved to a file as-is. This matters because the original data is poorly formatted and contains many inconsistencies. Processing it further is up to the user.
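As a rough illustration of what such post-processing could look like, the sketch below tidies the whitespace in the string values of a single record. The helper names (`normalize_value`, `normalize_record`) and the sample record with its field names are hypothetical and not part of this tool; the inconsistencies in the real data may well need different handling.

```python
import re


def normalize_value(value):
    """Collapse runs of whitespace and trim the ends of string values."""
    if isinstance(value, str):
        return re.sub(r"\s+", " ", value).strip()
    # Non-string values (numbers, None, ...) are returned unchanged.
    return value


def normalize_record(record):
    """Return a copy of a record (a dict) with every string value tidied."""
    return {key: normalize_value(val) for key, val in record.items()}


# Hypothetical record; the field names in real scraper output may differ.
raw = {"name": "  Münster,   St. Martini \n", "count": 3}
cleaned = normalize_record(raw)
print(cleaned)
```

Keeping the cleanup in a separate step like this leaves the scraped file untouched, so the raw data can always be reprocessed later.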
Make sure you have a recent version of Python installed. You can then install this script via pip:
$ pip install --user matricula-online-scraper
Alternatively, you can clone this repository and run the script with Poetry.
$ matricula-online-scraper --help
prints the available commands and options, including documentation. The same goes for each subcommand, e.g. matricula-online-scraper fetch --help.
The fetch command is the primary command for fetching resources from Matricula Online. Its subcommands scrape different resources; run matricula-online-scraper fetch --help to see the available subcommands.
Fetch all available locations and save them to a .jsonl file:
$ matricula-online-scraper fetch locations ./output.jsonl
:warning: This will fetch all parishes from Matricula Online, which may take a few minutes. This data rarely changes, and frequent scraping puts unnecessary load on the server. Our GitHub Workflow therefore caches this data once a week and pushes it to cache/parishes. ⚡️ Download CSV ⚡️
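If you work from the cached CSV instead of scraping, a first step might be to inspect which columns it contains. The following is a minimal sketch using only Python's standard library; the `read_rows` helper is hypothetical, and the sample columns (`name`, `url`) are stand-ins, since the actual header of the cached file is not documented here.

```python
import csv
import io


def read_rows(stream):
    """Return (column names, list of row dicts) from a CSV text stream."""
    reader = csv.DictReader(stream)
    # Accessing fieldnames reads the header row; the rest are data rows.
    return reader.fieldnames, list(reader)


# Stand-in data with assumed column names; the real file may look different.
sample = io.StringIO("name,url\nExample parish,https://example.org/\n")
columns, rows = read_rows(sample)
print(columns)
print(len(rows))
```

For the downloaded file, you would pass an open file handle instead of the sample, for example `open("parishes.csv", newline="", encoding="utf-8")`, assuming the file is saved under that name.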
Fetch all available registers from one parish in Münster, Germany, and save them to a .jsonl file:
$ matricula-online-scraper fetch parish ./output.jsonl --urls https://data.matricula-online.eu/en/deutschland/muenster/muenster-st-martini/
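Once a .jsonl file exists, it can be read back one record at a time. The sketch below assumes the usual JSON Lines layout (one JSON object per line); the `iter_jsonl` helper and the sample field names are hypothetical, as the exact structure of the scraper's output is not described here.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Stand-in data; the field names in real scraper output may differ.
sample = io.StringIO('{"name": "A"}\n\n{"name": "B"}\n')
records = list(iter_jsonl(sample))
print(len(records))
```

To read the file produced by the command above, pass an open handle such as `open("output.jsonl", encoding="utf-8")` instead of the sample. Iterating lazily keeps memory use low for large outputs.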
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions, especially bug fixes. Please make sure to follow the Contributing Guidelines.