E-ARK-Software / earkweb

E-ARK Web is software for creating and managing archival information packages. It also supports full-text search over the individual files they contain.
MIT License

earkweb

Introduction

earkweb is a repository for archiving digital objects. It offers basic functions for ingest, management and dissemination of information packages.

Software architecture

earkweb consists of a frontend web application and a task execution system based on Celery. The task system processes information packages synchronously or asynchronously by means of processing units called “tasks”.
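To illustrate the idea of a task as a processing unit that can run either synchronously or asynchronously, here is a minimal sketch. It is not earkweb code and does not use Celery: a standard-library thread pool stands in for the Celery workers, and `write_manifest`, `demo` and the file names are hypothetical.

```python
from __future__ import annotations

import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MANIFEST_NAME = "manifest.txt"  # hypothetical output file name


def write_manifest(working_dir: Path) -> Path:
    """Hypothetical task: list the files of a package into a manifest file."""
    names = sorted(
        p.name
        for p in working_dir.iterdir()
        if p.is_file() and p.name != MANIFEST_NAME
    )
    manifest = working_dir / MANIFEST_NAME
    manifest.write_text("\n".join(names) + "\n")
    return manifest


def demo() -> tuple[str, str]:
    """Run the same task once synchronously and once asynchronously."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)  # stands in for a package working directory
        (work / "a.txt").write_text("alpha")
        (work / "b.txt").write_text("beta")

        # Synchronous: the caller blocks until the task has finished.
        sync_result = write_manifest(work).read_text()

        # Asynchronous: the task is handed to a worker and the caller
        # collects the result later through a future.
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(write_manifest, work)
            async_result = future.result().read_text()

        return sync_result, async_result
```

In a Celery-based setup the same distinction would typically be made by calling a task directly or with `apply()` for synchronous execution, and with `delay()` or `apply_async()` to hand it to a worker.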

The following diagram illustrates the component architecture.

[Figure: architecture overview, lightweight version]

The user interface, represented by the box at the top of the diagram, is a Python/Django-based web application that supports the creation, management and exploration of information packages. Tasks can be assigned to Celery workers (green boxes marked with a "C"), which share the same storage area. The result of a package transformation is stored as files in the information package’s working directory. Full-text content included in information packages is indexed by Solr. A ResourceSync interface exposes the change list of information packages managed by the repository.
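As a rough illustration of what a ResourceSync change list looks like, the sketch below builds one with Python's standard library. It follows the general shape defined by the ResourceSync specification (a Sitemap `urlset` with `rs:md` metadata elements), but it is not taken from earkweb. The function name `build_changelist` and the example URL are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"


def build_changelist(changes, since):
    """Return a ResourceSync-style change list as an XML string.

    changes: iterable of (url, last_modified, change) tuples, where change
             is "created", "updated" or "deleted".
    since:   timestamp from which the listed changes are reported.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("rs", RS_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # Document-level metadata: marks this document as a change list.
    ET.SubElement(
        urlset,
        f"{{{RS_NS}}}md",
        {"capability": "changelist", "from": since},
    )

    for loc, lastmod, change in changes:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
        # Per-resource metadata: what kind of change occurred.
        ET.SubElement(url, f"{{{RS_NS}}}md", {"change": change})

    return ET.tostring(urlset, encoding="unicode")


# Hypothetical package URL, for illustration only.
xml_text = build_changelist(
    [("http://example.org/packages/pkg-1", "2024-01-02T00:00:00Z", "updated")],
    since="2024-01-01T00:00:00Z",
)
```

A client that harvests such a document can compare the `from` timestamp with its last synchronization time and fetch only the packages listed as created or updated.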

Installation

User guide