WIPACrepo / file_catalog

Store file metadata information in a file catalog
MIT License

Prerequisites

To install the prerequisites for the file catalog:

pip install -r requirements.txt

Running the server

To start an instance of the server:

python -m file_catalog

Configuration

All configuration is done using environment variables. To list the available configuration parameters and their defaults, run:

python -m file_catalog --show-config-spec
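As a rough illustration of environment-based configuration, the sketch below resolves settings from environment variables and falls back to defaults. The variable names (EXAMPLE_HOST, EXAMPLE_PORT), their values, and the resolve_config helper are hypothetical placeholders, not the file catalog's actual parameters; the authoritative list is whatever --show-config-spec prints.

```python
import os
from typing import Mapping, Optional

# Hypothetical spec: these names and defaults are placeholders, not the
# real file_catalog parameters (use --show-config-spec to see those).
EXAMPLE_SPEC = {
    "EXAMPLE_HOST": "localhost",
    "EXAMPLE_PORT": 8888,
}


def resolve_config(spec: Mapping[str, object],
                   environ: Optional[Mapping[str, str]] = None) -> dict:
    """Return the spec defaults, overridden by matching environment variables."""
    environ = os.environ if environ is None else environ
    config = {}
    for name, default in spec.items():
        raw = environ.get(name)
        if raw is None:
            config[name] = default
        else:
            # Environment values are strings; cast to the default's type.
            # This simple cast only suits str/int/float defaults.
            config[name] = type(default)(raw)
    return config
```

Calling resolve_config(EXAMPLE_SPEC) reads from the real process environment; passing a plain dict as the second argument makes the behaviour easy to check in isolation.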

Interface

The primary interface is an HTTP server. TLS and other security hardening mechanisms are handled by a reverse proxy server, as with typical web applications.

Browser

Requests to the main URL / are browsable like a standard website. The pages use JavaScript to call the REST API as necessary.

REST API

Requests with URLs of the form /api/RESOURCE can access the REST API. Responses are in HAL JSON format.
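For readers unfamiliar with HAL JSON, here is a minimal sketch of how a client might separate the hypermedia links from the resource's own fields. It relies only on the HAL convention that links live under the reserved _links key (with _embedded reserved for nested resources). The sample body, including the "files" field, is made up for illustration and is not taken from an actual server response.

```python
import json
from typing import Any, Dict, List


def hal_links(resource: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each link relation in a HAL resource to its list of hrefs."""
    links = {}
    for rel, value in resource.get("_links", {}).items():
        # HAL allows either one link object or a list of them per relation.
        items = value if isinstance(value, list) else [value]
        links[rel] = [item["href"] for item in items]
    return links


def hal_properties(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Return the resource's own fields, without HAL's reserved keys."""
    return {k: v for k, v in resource.items()
            if k not in ("_links", "_embedded")}


# Made-up response body, for illustration only.
SAMPLE_BODY = '{"_links": {"self": {"href": "/api/files"}}, "files": []}'
sample = json.loads(SAMPLE_BODY)
```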

File-Entry Fields

File-Metadata Schema:

Mandatory Fields:

Route: /api/files

Resource representing the collection of all files in the catalog.

Method: GET

Obtain list of files

REST-Query Parameters
HTTP Response Status Codes

Method: POST

Create a new file or add a replica

If the file already exists and the checksum is the same, a replica is added. If the checksum is different, a conflict error is returned.
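The create-or-replica rule can be sketched with an in-memory dictionary standing in for the catalog. This is an illustration of the decision logic only, not the server's implementation: the choice of a name as the identifying key, the "locations" list, and the string results are all assumptions made for the example.

```python
from typing import Dict

# In-memory stand-in for the catalog, keyed by a hypothetical file name.
Catalog = Dict[str, dict]


def post_file(catalog: Catalog, name: str, checksum: str, location: str) -> str:
    """Apply the POST rule: create, add a replica, or report a conflict."""
    existing = catalog.get(name)
    if existing is None:
        # Unknown file: create a new entry with its first location.
        catalog[name] = {"checksum": checksum, "locations": [location]}
        return "created"
    if existing["checksum"] != checksum:
        # Same file, different content: refuse and leave the entry untouched.
        return "conflict"
    # Same file, same checksum: record the new location as a replica.
    # Skipping an already-known location is an assumption of this sketch.
    if location not in existing["locations"]:
        existing["locations"].append(location)
    return "replica_added"
```

The point of the checksum comparison is that a second POST for the same file is only accepted when it describes identical content stored somewhere else.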

REST-Body
HTTP Response Status Codes

Method: DELETE

Not supported

Method: PUT

Not supported

Method: PATCH

Not supported

Route: /api/files/{uuid}

Resource representing the metadata for a file in the file catalog.

Method: GET

Obtain file metadata information

REST-Query Parameters
HTTP Response Status Codes

Method: POST

Not supported

Method: DELETE

Delete the metadata for the file

REST-Query Parameters
HTTP Response Status Codes

Method: PUT

Fully update/replace file metadata information

REST-Body
HTTP Response Status Codes

Method: PATCH

Partially update/replace file metadata information

The JSON provided as the body of a PATCH request need not contain all the keys, only those that need to be updated. A key provided with the value null is removed from the metadata.
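The PATCH semantics can be sketched as a shallow merge in which a null value (None in Python) deletes the key. The apply_patch helper and the sample field names are hypothetical, and the sketch covers top-level keys only; how the server treats nested objects is not specified here.

```python
def apply_patch(metadata: dict, patch: dict) -> dict:
    """Return a copy of metadata with the patch applied to top-level keys."""
    updated = dict(metadata)
    for key, value in patch.items():
        if value is None:
            # JSON null: drop the key if present, ignore it otherwise.
            updated.pop(key, None)
        else:
            # Any other value: add the key or overwrite its current value.
            updated[key] = value
    return updated
```

For example, patching {"a": 1, "b": 2} with {"b": None, "c": 3} leaves "a" untouched, removes "b", and adds "c".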

REST-Body
HTTP Response Status Codes

More About REST-Query Parameters

limit
start
query
keys
max_time_ms
Shortcut Parameters: logical-name-regex, logical_name, directory, filename

In decreasing order of precedence...

Shortcut Parameter: run_number
Shortcut Parameter: dataset
Shortcut Parameter: event_id
Shortcut Parameter: processing_level
Shortcut Parameter: season
Shortcut Parameter: all-keys
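A request using these parameters might be assembled as in the sketch below. The base URL is a placeholder, the files_query_url helper is hypothetical, and encoding query as a JSON document is an assumption of this example rather than something stated above; check the parameter descriptions before relying on it.

```python
import json
from urllib.parse import urlencode


def files_query_url(base_url, query=None, limit=None, start=None):
    """Build a GET URL for /api/files from optional query parameters."""
    params = {}
    if query is not None:
        # Assumption: the query filter is sent as a compact JSON string.
        params["query"] = json.dumps(query, separators=(",", ":"))
    if limit is not None:
        params["limit"] = limit
    if start is not None:
        params["start"] = start
    url = base_url.rstrip("/") + "/api/files"
    # urlencode percent-escapes the values so the JSON survives in a URL.
    return url + ("?" + urlencode(params) if params else "")
```

With a placeholder host, files_query_url("http://localhost:8888", limit=10, start=20) yields a URL ending in /api/files?limit=10&start=20, which pages through results ten at a time starting at the twenty-first entry if start is a zero-based offset.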

Development

Establishing a development environment

Follow these steps to get a development environment for the File Catalog:

cd ~/projects
git clone git@github.com:WIPACrepo/file_catalog.git
cd file_catalog
./setupenv.sh

MongoDB Instance for Testing

This command will spin up a disposable MongoDB instance using Docker:

docker run \
    --detach \
    --name test-mongo \
    --network=host \
    --rm \
    circleci/mongo:latest-ram

Building a Docker container

The following commands build a Docker container image for the file-catalog:

docker build -t file-catalog:{version} -f Dockerfile .
docker image tag file-catalog:{version} file-catalog:latest

Where {version} is found in file_catalog/__init__.py; e.g.:

__version__ = '1.2.0'       # For {version} use: 1.2.0
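To avoid copying the version by hand, a small script could extract it from the source text and build the image tag. This is only a sketch: read_version and image_tag are hypothetical helpers, and the regular expression assumes the version is assigned as a quoted string literal at the start of a line, as in the example above.

```python
import re

# Matches a line such as: __version__ = '1.2.0'
_VERSION_RE = re.compile(
    r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE
)


def read_version(init_source: str) -> str:
    """Extract the version string from the text of a Python module."""
    match = _VERSION_RE.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def image_tag(init_source: str, name: str = "file-catalog") -> str:
    """Build the name:version tag used in the docker commands."""
    return f"{name}:{read_version(init_source)}"
```

In practice the source text would come from reading the module file, and the resulting tag would be passed to docker build -t.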

Pushing Docker containers to local registry in Kubernetes

The following commands push the container image to the Docker registry in our Kubernetes cluster:

kubectl -n kube-system port-forward $(kubectl get pods --namespace kube-system -l "app=docker-registry,release=docker-registry" -o jsonpath="{.items[0].metadata.name}") 5000:5000 &
docker tag file-catalog:{version} localhost:5000/file-catalog:{version}
docker push localhost:5000/file-catalog:{version}