TrivadisPF / platys-modern-data-platform

Support for generating modern platforms dynamically with services such as Kafka, Spark, StreamSets, HDFS, and others.
Apache License 2.0

Remove configbaker to init Dataverse solr instance #942

Closed by gschmutz 2 weeks ago

gschmutz commented 2 weeks ago

The standard way to create and initialize the Solr instance for Dataverse is to use a separate Docker container:

  dataverse-solr-initializer:
    container_name: dataverse-solr-initializer
    hostname: dataverse-solr-initializer
    image: gdcc/configbaker:{{DATAVERSE_version}}
    command:
      - sh
      - -c
      - "fix-fs-perms.sh solr && cp -a /template/* /solr-template"
    volumes:
      - ./container-volume/dataverse/solr/data:/var/solr
      - ./container-volume/dataverse/solr/conf:/solr-template
    init: true
    restart: "no"

The drawback of this approach is that an external folder has to be volume-mapped into both containers. The benefit is that a single docker-compose.yml holds all the information and nothing else is needed.

We can drop the configbaker image and replace its functionality by providing init scripts through the /docker-entrypoint-initdb.d/ folder. This matches the way the CKAN Solr instance is initialized.
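As a rough illustration of what such an init script could look like, here is a minimal sketch. It assumes the official solr image runs `*.sh` files found in /docker-entrypoint-initdb.d/ before Solr starts. The file name, the `seed_core` helper, and the `TEMPLATE_DIR` and `CORE_DIR` variables with their default paths are hypothetical and not taken from the platys repository; only `DATAVERSE_CORE_NAME` comes from the service definition.

```shell
#!/bin/bash
# Hypothetical init script, e.g. ./init/dataverse/solr/010-create-core.sh.
# Paths and names below are assumptions, not taken from the platys repository.

# Copy a config template into a core directory unless the core already exists.
seed_core() {
  local template_dir="$1" core_dir="$2"
  if [ -f "$core_dir/core.properties" ]; then
    echo "core already present in $core_dir, skipping"
    return 0
  fi
  mkdir -p "$core_dir"
  cp -a "$template_dir/." "$core_dir/"
  # An empty core.properties lets Solr discover the directory as a core.
  touch "$core_dir/core.properties"
  echo "created core in $core_dir"
}

TEMPLATE_DIR="${TEMPLATE_DIR:-/docker-entrypoint-initdb.d/template}"
CORE_DIR="${CORE_DIR:-/var/solr/data/${DATAVERSE_CORE_NAME:-collection1}}"

# Only act when a template is actually mounted, so the script is a no-op otherwise.
if [ -d "$TEMPLATE_DIR" ]; then
  seed_core "$TEMPLATE_DIR" "$CORE_DIR"
fi
```

The existence check makes the script safe to run on every container start, which matters because the init folder is processed each time and not only on first boot.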

gschmutz commented 2 weeks ago

The Solr instance has been changed so that the commented-out lines (depends_on, volume mappings and command) are no longer used:

  dataverse-solr:
    image: solr:9.3.0
    container_name: dataverse-solr
    hostname: dataverse-solr
    labels:
      com.platys.name: "solr"
      com.platys.description: "Dataverse Platform SolR Instance"
#    depends_on:
#      - dataverse-solr-initializer
    environment:
      - DATAVERSE_CORE_NAME=collection1
    {%if use_timezone | default(false) %}
      - TZ={{use_timezone}}
    {% endif -%}   {#  use_timezone #}
    volumes:
      - ./data-transfer:/data-transfer
      - ./init/dataverse/solr:/docker-entrypoint-initdb.d/
#      - ./container-volume/dataverse/solr/data:/var/solr
#      - ./container-volume/dataverse/solr/conf:/template
    {%if use_timezone | default(false) %}
      - "./etc/timezone:/etc/timezone:ro"
      - "./etc/localtime:/etc/localtime:ro"
    {% endif -%}   {#  use_timezone #}
    {%if logging_driver is defined and logging_driver and logging_driver in ('fluentd','loki','syslog','splunk') | default(false) %}
    <<: *logging
    {% endif -%}   {#  logging_driver is defined ... #}
#    command:
#      - "solr-precreate"
#      - "collection1"
#      - "/template"
    restart: {{container_restart_policy}}