oduwsdl / MemGator

A Memento Aggregator CLI and Server in Go
https://memgator.cs.odu.edu/api.html
MIT License

MemGator

A Memento Aggregator CLI and Server in Go.

Features

Usage

CLI

The command line interface of MemGator retrieves the TimeMap and the description of the closest Memento (equivalent to the TimeGate) and writes them to STDOUT in any of the supported formats. Logs, benchmarks (in verbose mode), and error output are written to STDERR unless appropriate files are configured. For further details, see the full usage.

$ memgator [options] {URI-R}                            # TimeMap from CLI
$ memgator [options] {URI-R} {YYYY[MM[DD[hh[mm[ss]]]]]} # Description of the closest Memento from CLI

Server

When run as a Web Service, MemGator exposes the following customizable endpoints:

$ memgator [options] server
TimeMap:  http://localhost:1208/timemap/{FORMAT}/{URI-R}
TimeGate: http://localhost:1208/timegate/{URI-R} [Accept-Datetime]
Memento:  http://localhost:1208/memento[/{FORMAT}|proxy]/{DATETIME}/{URI-R}
About:    http://localhost:1208/about
Monitor:  http://localhost:1208/monitor - (Over SSE, if enabled)

  {FORMAT}          => link|json|cdxj
  {DATETIME}        => YYYY[MM[DD[hh[mm[ss]]]]]
  [Accept-Datetime] => Header in RFC1123 format

NOTE: A fallback endpoint /api is added for compatibility with Time Travel APIs to allow drop-in replacement in existing tools. This endpoint is an alias to the /memento endpoint that returns the description of a Memento.

Download and Install

Depending on the machine and operating system, download the appropriate binary from the releases page. Make the file executable with chmod +x MemGator-BINARY. Run it from the location of the downloaded binary, or rename it to memgator and move it into a directory that is in the PATH (such as /usr/local/bin/) to make it available as a command.

Running as a Docker Container

Build a Docker image locally from the source.

$ git clone https://github.com/oduwsdl/MemGator.git
$ cd MemGator
$ docker image build -t oduwsdl/memgator .

Alternatively, pull a published image from one of the two Docker image registries below:

$ docker image pull docker.pkg.github.com/oduwsdl/memgator/memgator
$ docker image pull oduwsdl/memgator

Run MemGator with various options inside a Docker container.

$ docker container run -it --rm oduwsdl/memgator -h
$ docker container run -it --rm oduwsdl/memgator [options] {URI-R}
$ docker container run -it --rm oduwsdl/memgator [options] {URI-R} {YYYY[MM[DD[hh[mm[ss]]]]]}
$ docker container run -d --name=memgator-server -p 1208:1208 oduwsdl/memgator [options] server
$ curl -i http://localhost:1208/about
$ docker container rm -f memgator-server

Full Usage

   _____                  _______       __
  /     \  _____  _____  / _____/______/  |___________
 /  Y Y  \/  __ \/     \/  \  ___\__  \   _/ _ \_   _ \
/   | |   \  ___/  Y Y  \   \_\  \/ __ |  | |_| |  | \/
\__/___\__/\____\__|_|__/\_______/_____|__|\___/|__|

# MemGator ({Version})

A Memento Aggregator CLI and Server in Go

Usage:
  memgator [options] {URI-R}                            # TimeMap from CLI
  memgator [options] {URI-R} {YYYY[MM[DD[hh[mm[ss]]]]]} # Description of the closest Memento from CLI
  memgator [options] server                             # Run as a Web Service

Options:
  -A, --agent=MemGator/{Version} <{CONTACT}>  User-agent string sent to archives
  -a, --arcs=https://git.io/archives          Local/remote JSON file path/URL for list of archives
  -b, --benchmark=                            Benchmark file location - defaults to Logfile
  -c, --contact=https://git.io/MemGator       Comment/Email/URL/Handle - used in the user-agent
  -D, --static=                               Directory path to serve static assets from
  -d, --dormant=15m0s                         Dormant period after consecutive failures
  -F, --tolerance=-1                          Failure tolerance limit for each archive
  -f, --format=Link                           Output format - Link/JSON/CDXJ
  -H, --host=localhost                        Host name - only used in web service mode
  -k, --topk=-1                               Aggregate only top k archives based on probability
  -l, --log=                                  Log file location - defaults to STDERR
  -m, --monitor=false                         Benchmark monitoring via SSE
  -P, --proxy=http://{HOST}[:{PORT}]{ROOT}    Proxy URL - defaults to host, port, and root
  -p, --port=1208                             Port number - only used in web service mode
  -R, --root=/                                Service root path prefix
  -r, --restimeout=1m0s                       Response timeout for each archive
  -S, --spoof=false                           Spoof each request with a random user-agent
  -T, --hdrtimeout=30s                        Header timeout for each archive
  -t, --contimeout=5s                         Connection timeout for each archive
  -V, --verbose=false                         Show Info and Profiling messages on STDERR
  -v, --version=false                         Show name and version

Build

Assuming that Git and Go (version >= 1.14) are installed, the code can be cloned, run, built, and installed using the following commands:

$ git clone https://github.com/oduwsdl/MemGator.git
$ cd MemGator
$ go run main.go
$ go build
$ go install
$ memgator --help
$ memgator http://example.com/

To compile cross-platform binaries, run the crossbuild.sh script:

$ ./crossbuild.sh

This will generate binaries for various operating systems and architectures in the /tmp/mgbins directory.

Citing Project

A publication related to this project appeared in the proceedings of JCDL 2016 (Read the PDF). Please cite it as below:

Sawood Alam and Michael L. Nelson. MemGator - A Portable Concurrent Memento Aggregator: Cross-Platform CLI and Server Binaries in Go. In Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2016, pp. 243-244, Newark, New Jersey, USA, June 2016.

@inproceedings{jcdl-2016:alam:memgator,
  author    = {Sawood Alam and
               Michael L. Nelson},
  title     = {{MemGator - A Portable Concurrent Memento Aggregator}},
  booktitle = {Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries},
  series    = {JCDL '16},
  year      = {2016},
  month     = {jun},
  location  = {Newark, New Jersey, USA},
  pages     = {243--244},
  numpages  = {2},
  url       = {http://dx.doi.org/10.1145/2910896.2925452},
  doi       = {10.1145/2910896.2925452},
  isbn      = {978-1-4503-4229-2},
  publisher = {ACM},
  address   = {New York, NY, USA}
}