archivematica / Issues

Issues repository for the Archivematica project
GNU Affero General Public License v3.0

Problem: FITS JVM memory usage is unsustainable #357

Open sevein opened 5 years ago

sevein commented 5 years ago

This issue replaces https://github.com/artefactual-labs/am-packbuild/issues/184.

Expected behaviour: Our package should include sane defaults to avoid out-of-memory errors in standard environments.

Current behaviour: See the original report from @jpellman:

When running the "Characterize and extract metadata" microservice in Archivematica 1.7.0 we noticed that we were consistently running into out-of-memory errors on the host due to the Java metaspace growing to a point that all of our RAM was used up due to the fits/nailgun. I've reported this issue to the FITS folks, but you might want to have the FITS package perform a patch to incorporate the changes I suggested in the linked issue above.

@jpellman's original issue title suggests the following solution:

Patch fits-ngserver.sh in RPM/DEB to Use Sensible Java Metaspace Settings

He also filed: https://github.com/harvard-lts/fits/issues/177.

Steps to reproduce: Watch FITS JVM memory usage while running some transfers through Archivematica. By default, FITS is used when the identification tool cannot determine the format.
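One way to watch the memory usage mentioned above is to sample the resident set size of the Nailgun JVM from /proc. This is a minimal sketch, assuming a Linux host; the function name `rss_kb_of_pid` is hypothetical and not part of FITS or Archivematica.

```shell
# Print the resident set size (in kB) of a process, read from /proc.
# Linux only; rss_kb_of_pid is a hypothetical helper name.
rss_kb_of_pid() {
    local status_file="/proc/$1/status"
    local key value rest
    [ -r "$status_file" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "VmRSS:" ]; then
            echo "$value"
            return 0
        fi
    done < "$status_file"
    return 1
}

# Example polling loop (commented out so that sourcing this file has no side effects):
#   pid=$(pgrep -f com.martiansoftware.nailgun.NGServer)
#   while sleep 5; do echo "$(date +%T) $(rss_kb_of_pid "$pid") kB"; done
```

RSS covers the whole process, not just metaspace. If a JDK is installed, `jstat -gc <pid>` should give a metaspace-specific view through its MC and MU columns.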

Your environment (version of Archivematica, OS version, etc.): Archivematica v1.8.0



mamedin commented 5 years ago

We added "-Xms1g -Xmx1g -XX:MaxMetaspaceSize=1g" to /usr/bin/fits-ngserver.sh on the am18xenial.qa server three weeks ago:

root@am18xenial:~# cat /usr/bin/fits-ngserver.sh 
#!/bin/bash
#
# This helper script launches a nailgun server with the FITS
# classpath, making it simple to launch a persistent JVM for FITS.
# 
# The one required parameter is the path to nailgun's jar; it can also be
# specified via the NAILGUN_JAR environment variable.

. "$(dirname $BASH_SOURCE)/fits-env.sh"

if [[ ! $NAILGUN_JAR ]] && [[ ! $1 ]]; then
    echo "Error: Path to Nailgun JAR must be specified!" >&2
    echo "Usage: fits-ngserver.sh [path-to-nailgun.jar]" >&2
    echo "The path may also be specified via the NAILGUN_JAR environment variable." >&2
    exit 64
else
    NAILGUN_JAR=$1
fi

cmd="java -Xms1g -Xmx1g -XX:MaxMetaspaceSize=1g -Dlog4j.configuration=file:\"$FITS_HOME\"/log4j.properties -classpath \"$APPCLASSPATH:$NAILGUN_JAR\" com.martiansoftware.nailgun.NGServer"

echo "You may now run FITS by typing: ng edu.harvard.hul.ois.fits.Fits [options]" >&2

eval "exec $cmd"
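
A side note on the argument handling in the script above: as written, the else branch assigns $1 to NAILGUN_JAR even when $1 is empty, so a path supplied only through the environment variable appears to be overwritten with an empty string. A minimal sketch of one way to honour both inputs follows; the function name `resolve_nailgun_jar` is hypothetical and not part of FITS.

```shell
# Resolve the Nailgun jar path: prefer the positional argument and
# fall back to the NAILGUN_JAR environment variable.
# resolve_nailgun_jar is a hypothetical helper, not part of FITS.
resolve_nailgun_jar() {
    local jar="${1:-${NAILGUN_JAR:-}}"
    if [ -z "$jar" ]; then
        echo "Error: Path to Nailgun JAR must be specified!" >&2
        return 64
    fi
    printf '%s\n' "$jar"
}
```

In the launcher this could replace the if/else block, for example `NAILGUN_JAR=$(resolve_nailgun_jar "$1") || exit 64`.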

I'm testing on three other VMs with a large dataset (>8k files).
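
If the limits end up in the package, one option might be to keep them overridable rather than hard-coded. The sketch below builds the same command string as the script above, but reads the JVM flags from an environment variable with the tested values as the default. Both `FITS_JAVA_OPTS` and `build_fits_ng_cmd` are hypothetical names, not something FITS or Archivematica currently defines; `FITS_HOME` and `APPCLASSPATH` are assumed to come from fits-env.sh, as in the original script.

```shell
# Default JVM limits, matching the values tested on am18xenial.qa.
DEFAULT_FITS_JAVA_OPTS="-Xms1g -Xmx1g -XX:MaxMetaspaceSize=1g"

# Build the Nailgun server command line.
# FITS_JAVA_OPTS (hypothetical) overrides the defaults when set and non-empty.
build_fits_ng_cmd() {
    local nailgun_jar="$1"
    local java_opts="${FITS_JAVA_OPTS:-$DEFAULT_FITS_JAVA_OPTS}"
    echo "java $java_opts -Dlog4j.configuration=file:\"$FITS_HOME\"/log4j.properties -classpath \"$APPCLASSPATH:$nailgun_jar\" com.martiansoftware.nailgun.NGServer"
}
```

The launcher would then use `cmd=$(build_fits_ng_cmd "$NAILGUN_JAR")` before the existing `eval "exec $cmd"` line, and a site with less RAM could set, say, `FITS_JAVA_OPTS="-Xmx512m -XX:MaxMetaspaceSize=256m"` without editing the packaged file.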

ross-spencer commented 5 years ago

I wonder if this is an appropriate ticket to also discuss the idea of making FITS an opt-in / opt-out microservice, as raised by @ablwr in Slack. I could imagine a number of ways to control this that would be flexible for users.