See #3 for details on what Matricula hosts, how things are organized, and the terminology used.
The following command scrapes all parishes available on Matricula (subject to the optional search/filter parameters):
$ matricula-online-scraper fetch location -e csv
This returns a list with > 8000 entries. Here's the head of the output:
country,region,name,url
Slovenia,Nadškofija Maribor,001 Apače,https://data.matricula-online.eu/en/slovenia/maribor/apace/
Slovenia,Nadškofija Maribor,002 Artiče,https://data.matricula-online.eu/en/slovenia/maribor/artice/
Slovenia,Nadškofija Maribor,004 Bele Vode,https://data.matricula-online.eu/en/slovenia/maribor/bele-vode/
Slovenia,Nadškofija Maribor,005 Beltinci,https://data.matricula-online.eu/en/slovenia/maribor/beltinci/
Slovenia,Nadškofija Maribor,006 Bizeljsko,https://data.matricula-online.eu/en/slovenia/maribor/bizeljsko/
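To make the "pipe the URLs to the second command" step concrete, here is a minimal sketch of pulling the `url` column out of the CSV output. It only uses Python's standard library; the function name `parish_urls` and the `SAMPLE` constant are made up for illustration and are not part of matricula-online-scraper.

```python
import csv
import io

# Two rows copied from the head of the output shown above.
SAMPLE = """\
country,region,name,url
Slovenia,Nadškofija Maribor,001 Apače,https://data.matricula-online.eu/en/slovenia/maribor/apace/
Slovenia,Nadškofija Maribor,002 Artiče,https://data.matricula-online.eu/en/slovenia/maribor/artice/
"""


def parish_urls(csv_text):
    """Return the `url` value of every row in the given CSV text.

    Hypothetical helper for illustration, not part of the scraper.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        # Strip padding around header names and values, in case the
        # CSV was written with spaces next to the commas.
        cleaned = {
            key.strip(): value.strip()
            for key, value in row.items()
            if isinstance(key, str) and isinstance(value, str)
        }
        urls.append(cleaned["url"])
    return urls


if __name__ == "__main__":
    for url in parish_urls(SAMPLE):
        print(url)
```

Each printed URL identifies one parish and could then be handed to the command that lists the sources of a parish.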
Taking the output of the first command, i.e. the URLs, we can pipe it into the second one. The following command then scrapes all available sources of a parish. For 001 Apače:
Description
This returns a list of all available digitized sources of a parish. Here's the head of the output:
I propose renaming the subcommands so that they better match the entities of Matricula, which should make them more intuitive:
- `fetch location` becomes `list parishes`, which can be used like `list parishes --all` or `list parishes --filter-place "name"`
- `fetch parish` becomes `list sources`, which can be used like `list sources --parish … --parish …`
- `get source`, which can be used like `get source --url … --url …`
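To illustrate how the proposed command layout could look, here is a rough sketch using `argparse` from Python's standard library. This is an assumption-laden stand-in, not the project's actual CLI code: the function name `build_parser`, the `dest` names, and the choice that `--parish` and `--url` are repeatable string options are all illustrative guesses based on the usage lines above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the proposed subcommand names (sketch only)."""
    parser = argparse.ArgumentParser(prog="matricula-online-scraper")
    commands = parser.add_subparsers(dest="command", required=True)

    # `list` groups the two listing subcommands.
    list_cmd = commands.add_parser("list")
    list_sub = list_cmd.add_subparsers(dest="entity", required=True)

    # list parishes --all | list parishes --filter-place "name"
    parishes = list_sub.add_parser("parishes")
    parishes.add_argument("--all", action="store_true")
    parishes.add_argument("--filter-place", metavar="NAME")

    # list sources --parish … --parish … (assumed repeatable)
    sources = list_sub.add_parser("sources")
    sources.add_argument("--parish", action="append", required=True)

    # get source --url … --url … (assumed repeatable)
    get_cmd = commands.add_parser("get")
    get_sub = get_cmd.add_subparsers(dest="entity", required=True)
    source = get_sub.add_parser("source")
    source.add_argument("--url", action="append", required=True)

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With `action="append"`, repeating `--parish` or `--url` collects all given values into a list, which matches the repeated-option usage in the proposal.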
Affected Versions
All versions, including the most recent one (v0.3.0).
This proposes a breaking change!