The core component of the ARCHE repository solution responsible for the CRUD operations and transaction support.
composer require acdh-oeaw/arche-core
See https://github.com/acdh-oeaw/arche-docker
An environment allowing you to edit code in your host system and run all the tests inside a docker container.
git clone https://github.com/acdh-oeaw/arche-core.git
cd arche-core
composer update
docker build -t arche-dev build/docker
docker run --name arche-dev -v `pwd`:/var/www/html -e USER_UID=`id -u` -e USER_GID=`id -g` -d arche-dev
docker logs -f arche-dev
wait until you see (timestamps will obviously differ):
2020-06-04 14:06:52,309 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-04 14:06:52,309 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-04 14:06:52,309 INFO success: rabbitmq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-04 14:06:52,309 INFO success: tika entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
then hit CTRL+c
docker exec -ti -u www-data arche-dev /bin/bash
and then inside the container
XDEBUG_MODE=coverage vendor/bin/phpunit
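The "wait until you see" step above can also be scripted. The following is a minimal Python sketch, not part of the repository. It assumes that the four "entered RUNNING state" lines shown above are a sufficient readiness signal and that the container is named arche-dev, as in the docker run command. The helper names services_ready and wait_until_ready are made up for this illustration.

```python
import re
import subprocess
import time

# Services whose "entered RUNNING state" lines appear in the sample log above.
SERVICES = ("apache2", "postgresql", "rabbitmq", "tika")
RUNNING = re.compile(r"INFO success: (\S+) entered RUNNING state")


def services_ready(log_text, services=SERVICES):
    """Return True once every expected service has reported RUNNING."""
    started = set(RUNNING.findall(log_text))
    return all(name in started for name in services)


def wait_until_ready(container="arche-dev", timeout=300, interval=2):
    """Poll `docker logs` until all services are up or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["docker", "logs", container],
            capture_output=True,
            text=True,
        )
        # Container output may arrive on either stream, so check both.
        if services_ready(result.stdout + result.stderr):
            return True
        time.sleep(interval)
    return False
```

Calling wait_until_ready() right after docker run would replace watching docker logs -f by hand; it returns False if the services have not come up within the timeout.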
Remarks: to switch from the mod_php config to the php_fpm one:
docker exec arche-dev a2dissite mod_php
docker exec arche-dev a2ensite php_fpm
docker exec -w /root arche-dev supervisorctl restart apache2
Similarly, to get back to the mod_php config:
docker exec arche-dev a2dissite php_fpm
docker exec arche-dev a2ensite mod_php
docker exec -w /root arche-dev supervisorctl restart apache2
The main table is the resources one. It stores a list of all repository resources identified by their internal repo id (the id column) as well as transaction-handling data (the transaction_id and state columns).
Metadata are divided into three tables according to the consistency checks applying to them:
- The identifiers table stores resources' identifiers (the repository assumes every resource may have many). The table enforces global uniqueness of identifiers. The RDF property storing the identifier comes implicitly from the repository's config.yaml ($.schema.id) and is not explicitly stored inside the database.
- The relations table stores all RDF triples having a URI as an object. With a foreign key check, it enforces that the repository resource an RDF triple points to exists.
- The metadata table stores all other RDF triples. This table puts no constraints on the data. Triples are stored in an RDF-like way: each row in the table represents a single triple.
  - The value_n/value_t columns store the value cast to a number/timestamp. This allows for correct comparisons which would fail against string values (compared as strings, '10' sorts before '9').
  - The index on the value column covers only the first 1000 characters of the value. This is both for technical and performance reasons. An important consequence is that if you want to benefit from indexed search on the value column, you should state your condition as substring(value, 1, 1000) = 'yourValue'.
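The substring condition described above can be generated by a small helper. This is a hedged Python sketch rather than code from the repository: the function name value_equals_condition is hypothetical, and the %s placeholders assume a PostgreSQL driver with that parameter style (for example psycopg). For values longer than 1000 characters it compares the prefix first and then adds an exact comparison on the full value, which is an addition of this sketch and not something the text above prescribes.

```python
PREFIX_LEN = 1000  # prefix length named in the text above


def value_equals_condition(value):
    """Build a WHERE fragment and its parameters for an equality search.

    The fragment always compares the first 1000 characters, matching the
    substring(value, 1, 1000) form given in the text.
    """
    sql = "substring(value, 1, 1000) = %s"
    params = [value[:PREFIX_LEN]]
    if len(value) > PREFIX_LEN:
        # Assumption of this sketch: the prefix match alone is not enough
        # for longer values, so add an exact comparison as well.
        sql += " AND value = %s"
        params.append(value)
    return sql, params
```

The returned fragment is meant to be appended to a query against the metadata table, with the parameters passed separately to the driver instead of being interpolated into the SQL string.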
Supplementary tables include:
- The transactions table, which stores information about pending transactions.
- The metadata_history table, which stores the history of metadata modifications. It is filled in automatically by triggers on the identifiers, relations and metadata tables.
- The full_text_search table, which stores a GIST index on tokenized metadata values and resources' text content, allowing for full text search (see the PostgreSQL documentation).
- The spatial_search table, which stores vector spatial data as PostGIS geography, allowing for spatial searches (see the PostGIS documentation).
- The raw table, which is used only for data migration from the previous ACDH-CH repository solution.
Views and functions:
- metadata_view gathers together triples from the identifiers, relations and metadata tables.
- The get_relatives() function makes it easy to find resources related to a given one by a given RDF property. Internally it uses a recursive query which could be difficult to write correctly on your own.
- The get_neighbors_metadata() and get_relatives_metadata() functions make it easy to fetch metadata triples of both a given resource and the resources related to it, either by any single-hop RDF property (get_neighbors_metadata()) or by any number of hops of one selected metadata property (get_relatives_metadata()).
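To give an idea of the kind of recursive query that get_relatives() spares you from writing, here is a self-contained Python sketch. It does not call the real function, whose signature is not shown in this document, and it does not use the real schema. It runs against an in-memory SQLite database with a hypothetical, simplified relations table (id, target_id, property), and the property names are invented for the example.

```python
import sqlite3

# Hypothetical stand-in for the relations table:
# one row per triple "resource id --property--> target_id".
SCHEMA = "CREATE TABLE relations (id INTEGER, target_id INTEGER, property TEXT)"

# Follow one property for any number of hops. UNION (not UNION ALL)
# drops already visited ids, so cycles in the data cannot loop forever.
RELATIVES_SQL = """
WITH RECURSIVE rel(id) AS (
    SELECT ?
    UNION
    SELECT r.target_id
    FROM relations r
    JOIN rel ON r.id = rel.id
    WHERE r.property = ?
)
SELECT id FROM rel ORDER BY id
"""


def find_relatives(conn, resource_id, prop):
    """Return the start resource and every resource reachable through prop."""
    rows = conn.execute(RELATIVES_SQL, (resource_id, prop)).fetchall()
    return [row[0] for row in rows]


def demo_connection():
    """Create an in-memory database with a small cyclic example graph."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO relations VALUES (?, ?, ?)",
        [
            (1, 2, "isPartOf"),
            (2, 3, "isPartOf"),
            (3, 1, "isPartOf"),  # closes a cycle on purpose
            (1, 4, "hasTitleImage"),
        ],
    )
    return conn
```

The cycle in the sample data shows the main pitfall of hand-written recursive queries: without deduplication or another stop condition, the recursion would never terminate.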