Esri / geoportal-server-harvester

Metadata Harvester for Esri Geoportal Server
http://esri.github.io/geoportal-server/
Apache License 2.0

Harvester task execution returns 500 internal server error #134

Open scook12 opened 3 years ago

Issue

When I attempt to execute a task, the server returns a 500 Internal Server Error.

Error messages:

In the UI: Error executing task: natres-harvest-server
In the dev console: Unable to load rest/harvester/tasks/e748f210-5af0-48f0-8393-5f648819a9df/execute? status: 500
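To capture the response body behind the 500 outside the UI, the failing call can be replayed with a short script. This is a minimal sketch, not the harvester client: the base URL `http://localhost:8082/harvester` is an assumption derived from the port mapping and WAR name further down, and the HTTP method defaults to GET only because the dev console output does not show which method the UI used.

```python
import urllib.error
import urllib.request
from typing import Tuple


def build_execute_request(base_url: str, task_id: str,
                          method: str = "GET") -> urllib.request.Request:
    """Build a request for <base_url>/rest/harvester/tasks/<task_id>/execute.

    base_url and method are assumptions; adjust them to match the deployment
    and whatever the browser's network tab shows for the failing call.
    """
    url = f"{base_url.rstrip('/')}/rest/harvester/tasks/{task_id}/execute"
    return urllib.request.Request(url, method=method)


def fetch_status_and_body(request: urllib.request.Request,
                          timeout: float = 30.0) -> Tuple[int, str]:
    """Send the request and return (status, body), including for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urllib raises on error statuses; the error body often holds the server's message.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `fetch_status_and_body(build_execute_request("http://localhost:8082/harvester", "e748f210-5af0-48f0-8393-5f648819a9df"))` against the running container should return the status code together with any error text Tomcat includes in the response.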

Task definition, as returned by /rest/harvester/tasks/e748f210-5af0-48f0-8393-5f648819a9df:

{
    "uuid": "e748f210-5af0-48f0-8393-5f648819a9df",
    "taskDefinition": {
        "name": "natres-harvest-server",
        "processor": null,
        "source": {
            "type": "AGS",
            "label": "natres-server",
            "properties": {
                "ags-host-url": "https://natres-ags.bd.esri.com/server/rest/",
                "cred-username": "gisadmin",
                "cred-password": "iJR7WX7TREsz5kCmFFjGCkcduaPgfnjYO74eMhlIiLs=RSUyNHJpMjg1Mw==",
                "ags-enable-layers": "false",
                "ags-emit-xml": "false",
                "ags-emit-json": "true"
            },
            "keywords": [],
            "ref": "c9965955-ca9b-45f6-9ee1-ce54c0a36a31"
        },
        "destinations": [],
        "keywords": [],
        "incremental": false,
        "ignoreRobotsTxt": false,
        "ref": "e748f210-5af0-48f0-8393-5f648819a9df"
    }
}
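One thing visible in the definition above is that `destinations` is an empty list. Whether that is what triggers the 500 is not confirmed by anything in this report, but it is cheap to check before executing. The sketch below is a hypothetical pre-flight check (the function name and the messages are made up for illustration); it only inspects the JSON shape shown above.

```python
import json
from typing import List


def find_task_issues(task: dict) -> List[str]:
    """Return human-readable notes about a task definition that may be incomplete.

    Only checks fields that appear in the JSON returned by the tasks endpoint;
    it does not know which fields the harvester actually requires.
    """
    definition = task.get("taskDefinition") or {}
    issues = []
    if not definition.get("source"):
        issues.append("taskDefinition.source is missing")
    if not definition.get("destinations"):
        issues.append("taskDefinition.destinations is empty")
    return issues


def find_task_issues_in_json(text: str) -> List[str]:
    """Convenience wrapper for a raw JSON response body."""
    return find_task_issues(json.loads(text))
```

For the task in this report, the check would flag only the empty `destinations` list.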

Expected

The task executes and harvests content or fails gracefully with feedback to the user on whatever caused the failure.

Environment

Provided for context only. The deployment otherwise works fine for both the geoportal and the harvester: I can log in, add brokers, add tasks, and so on. I just cannot execute tasks.

The only thing in the logs that looked potentially relevant was this warning:

WARNING [http-nio-8080-exec-3] org.springframework.web.servlet.handler.AbstractHandlerExceptionResolver.logException 
Resolved [org.springframework.http.converter.HttpMessageNotReadableException: 
Required request body is missing: public org.springframework.http.ResponseEntity<com.esri.geoportal.harvester.support.TaskResponse> com.esri.geoportal.harvester.rest.TaskController.addTask(com.esri.geoportal.harvester.api.defs.TaskDefinition)]
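Note that the warning names `TaskController.addTask`, not an execute handler, so it may have been logged for a different request than the failing one; that is an inference from the log text, not something confirmed here. When scanning a longer log for similar entries, a small parser can pull out the exception class and the handler method. This sketch assumes the "Resolved [...]" layout shown above; the function name is hypothetical.

```python
import re
from typing import Dict, Optional

# Matches the "Resolved [<exception>: <message>]" part of the warning.
RESOLVED = re.compile(r"Resolved \[(?P<exception>[\w.$]+): (?P<message>.*)\]$")
# Matches a fully qualified "Class.method(" inside the message.
HANDLER = re.compile(r"(?P<cls>[\w.$]+)\.(?P<method>\w+)\(")


def summarize_resolved_warning(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract exception class and handler method from a Spring 'Resolved [...]' warning."""
    # The warning may be wrapped across lines; collapse all whitespace first.
    flat = " ".join(text.split())
    match = RESOLVED.search(flat)
    if match is None:
        return None
    handler = HANDLER.search(match.group("message"))
    return {
        "exception": match.group("exception"),
        "handler_class": handler.group("cls") if handler else None,
        "handler_method": handler.group("method") if handler else None,
    }
```

Applied to the warning above, the summary's handler method is `addTask`, which makes the mismatch with the `/execute` call easy to spot.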

Versions: docker 20.10, docker-compose 1.27.4, adoptopenjdk 11, openj9, tomcat 9.0.41

Dockerfile

FROM adoptopenjdk:11-jre-openj9 as gpt-builder
ARG VERSION=2.6.4
ARG GPT_URL=https://github.com/Esri/geoportal-server-catalog/releases/download/v${VERSION}/geoportal-server-catalog-${VERSION}.zip
ARG HRV_URL=https://github.com/Esri/geoportal-server-harvester/releases/download/v${VERSION}/geoportal-server-harvester-${VERSION}.zip

# -y keeps the install non-interactive; --no-install-recommends applies to install, not update
RUN apt-get update -q \
  && apt-get install -y -q --no-install-recommends wget unzip

# catalog
RUN mkdir -p /usr/src/app/catalog
WORKDIR /usr/src/app/catalog
RUN wget ${GPT_URL}
RUN unzip geoportal-server-catalog-${VERSION}.zip 

# harvester
RUN mkdir -p /usr/src/app/harvester
WORKDIR /usr/src/app/harvester
RUN wget ${HRV_URL}
RUN unzip geoportal-server-harvester-${VERSION}.zip

# server
FROM tomcat:9.0.41-jdk11-adoptopenjdk-openj9

COPY config/tomcat-users.xml $CATALINA_HOME/conf
COPY --from=gpt-builder /usr/src/app/catalog/*.war $CATALINA_HOME/webapps/geoportal.war
COPY --from=gpt-builder /usr/src/app/harvester/*.war $CATALINA_HOME/webapps/harvester.war

docker-compose.yml

version: "3.5"

services:
  geoportal:
    build: gpt_stack/geoportal
    ports:
      - "8082:8080"
    hostname: geoportal
    environment:
      - es_cluster=es-geoportal
      - es_node=elasticsearch
      - gpt_authentication=authentication-simple.xml
    volumes:
      - gptharvester:/root
      - gptmetadata:/opt/tomcat/webapps/metadata
    networks:
      - datastudio

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
    container_name: elasticsearch1
    hostname: elasticsearch
    ports:
      - "9200:9200"
      - "9300:9300"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    environment:
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - cluster.name=es-geoportal
      - xpack.security.enabled=false
      - xpack.ml.enabled=false
    networks:
      - datastudio
networks:
  datastudio:
    external:
      name: datastudio

volumes:
  esdata1:
    driver: local
  gptharvester:
    driver: local
  gptmetadata:
    driver: local