jsuto / piler

Email archiving application
https://www.mailpiler.org/
Other
82 stars 9 forks source link

Manticore indexes not updating #182

Closed sebastianhelbig closed 2 months ago

sebastianhelbig commented 2 months ago

piler: 1.4.6-80dadac RT: 0

We switched from a local installation to a docker based installation some days ago and migrated all the existing data. Basically everything is working, except that newly added mails won't show up in the web ui. The counter for received mails goes up, I can see the metadata in the mysql database and I also see that the mails are added to the store.

The problem seems to be that the indexing cronjobs inside the piler container run, but they are not updating the indexes on the separate manticore container, but instead they are updating indexes in /var/piler/manticore inside the piler container.

How to fix this?

piler.conf:

archive_address=
archive_emails_not_having_message_id=1
archive_only_mydomains=0
backlog=20
check_for_client_timeout_interval=20
cipher_list=ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
clamd_socket=/tmp/clamd
default_retention_days=36500
enable_chunking=0
enable_cjk=0
enable_folders=0
encrypt_messages=0
extra_to_field=X-Envelope-To:
extract_attachments=1
helper_timeout=20
hostid=<redacted>
listen_addr=0.0.0.0
listen_port=25
max_connections=64
max_message_size=50000000
max_requests_per_child=10000
max_smtp_memory=500000000
memcached_servers=127.0.0.1
memcached_to_db_interval=900
memcached_ttl=86400
min_message_size=100
min_word_len=1
mmap_dedup_test=0
mysql_connect_timeout=2
mysqlcharset=utf8mb4
mysqldb=piler
mysqlhost=db
mysqlpwd=cdjk3MDC9wan7fwc
mysqlsocket=
mysqluser=piler
number_of_worker_processes=2
pemfile=/etc/piler/piler.pem
pidfile=/var/run/piler/piler.pid
piler_header_field=X-piler-id:
process_rcpt_to_addresses=0
rtindex=0
security_header=
server_id=0
smtp_access_list=0
smtp_timeout=60
spam_header_line=
sphxdb=piler1
sphxhost=manticore
sphxport=9306
syslog_recipients=0
tls_enable=1
tls_min_version=TLSv1.2
tweak_sent_time_offset=0
update_counters_to_memcached=0
username=piler
verbosity=1
workdir=/var/piler/tmp

config-site.php:

<?php

$config['SITE_NAME'] = '<redacted>';
$config['SITE_URL'] = 'https://<redacted>/';

$config['SMTP_DOMAIN'] = '<redacted>';
$config['SMTP_FROMADDR'] = '<redacted>';
$config['ADMIN_EMAIL'] = '<redacted>';

$config['DECRYPT_BINARY'] = '/usr/bin/pilerget';
$config['DECRYPT_ATTACHMENT_BINARY'] = '/usr/bin/pileraget';

$config['PILER_BINARY'] = '/usr/sbin/piler';

$config['DB_HOSTNAME'] = 'db';
$config['DB_DATABASE'] = 'piler';
$config['DB_USERNAME'] = 'piler';
$config['DB_PASSWORD'] = '<redacted>';

$config['RT'] = 0;

$config['SPHINX_HOSTNAME'] = 'manticore:9306';
$config['SPHINX_HOSTNAME_READONLY'] = 'manticore:9307';
$config['SPHINX_MAIN_INDEX'] = 'main1';

$config['MEMCACHED_ENABLED'] = 1;
$memcached_server = ['memcached', 11211];

manticore.conf:

#!/usr/bin/php
<?php

define('LOCALSTATEDIR', '/var');
define('NGRAM_CONFIG', "        #ngram_len               = 1\n        #ngram_chars             = U+3000..U+2FA1F\n");

# See http://sphinxsearch.com/wiki/doku.php?id=charset_tables for more on the charset_table settings
# The following settings contains English and some Latin extras
define('CHARSET_TABLE', "0..9, english, _, \
                                  U+C1->U+E1, U+C4->U+E4, U+C5->U+E5, U+C6->U+E6, U+C9->U+E9, U+CD->U+ED, U+D3->U+F3, U+D6->U+F6, U+D8->U+F8, \
                                  U+DA->U+FA, U+DC->U+FC, U+0150->U+0151, U+0152->U+0153, U+0170->U+0171, U+01E2->U+E6, U+01E3->U+E6, U+01FC->U+E6, \
                                  U+01FD->U+E6, U+1D01->U+E6, U+1D02->U+E6, U+1D2D->U+E6, U+1D46->U+E6, \
                                  U+DF, U+E1, U+E4, U+E5, U+E6, U+E9, U+ED, U+00F3, U+F6, U+F8, U+FA, U+FC, U+0151, U+0153, U+0171\n");

define('SELECT_FIELDS', 'id, `from` as sender, `to` as rcpt, `fromdomain` as senderdomain, `todomain` as rcptdomain, `subject`, `arrived`, `sent`, `body`, `size`, `direction`, `folder`, `attachments`, `attachment_types`');

define('RT', 0);

?>

#
# minimal manticore configuration suited to piler
#

<?php if(RT == 0) { ?>

source base
{
   type = mysql
   sql_host = db
   sql_db = piler
   sql_user = piler
   sql_pass = cdjk3MDC9wan7fwc

   sql_attr_uint = size
   sql_attr_uint = sent
   sql_attr_uint = attachments
}

source delta : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query_pre  = REPLACE INTO sph_counter SELECT 1, IFNULL(MAX(id), 0) FROM sph_index
   sql_query_post_index  = DELETE FROM sph_index WHERE id<=(SELECT max_doc_id FROM sph_counter WHERE counter_id=1)
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id=1)

   sql_query_killlist = SELECT `id` FROM `metadata` WHERE `deleted`=1
}

source main1 : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id=-1
}

source main2 : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id=-1
}

source main3 : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id=-1
}

source main4 : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id=-1
}

source dailydelta : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query = SELECT <?php print SELECT_FIELDS; ?> FROM sph_index WHERE id=-1
}

index main1
{
        source                  = main1
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/main1
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
<?php print NGRAM_CONFIG; ?>
}

index main2
{
        source                  = main2
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/main2
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
<?php print NGRAM_CONFIG; ?>
}

index main3
{
        source                  = main3
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/main3
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
<?php print NGRAM_CONFIG; ?>
}

index main4
{
        source                  = main4
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/main4
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
<?php print NGRAM_CONFIG; ?>
}

index dailydelta1
{
        source                  = dailydelta
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/dailydelta1
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
<?php print NGRAM_CONFIG; ?>
}

index delta1
{
        source                  = delta
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/delta1
        min_prefix_len          = 5
        min_word_len            = 1
        stored_fields           =
        charset_table           = <?php print CHARSET_TABLE; ?>
        killlist_target         = main1:kl, main2:kl, main3:kl, main4:kl, dailydelta1:kl
<?php print NGRAM_CONFIG; ?>
}

source tag : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query  = SELECT `_id`, `id` AS iid, `uid`, `tag` FROM `tag`

   sql_attr_uint = iid
   sql_attr_uint = uid
}

source note : base
{
   sql_query_pre = SET NAMES utf8mb4
   sql_query  = SELECT `_id`, `id` AS iid, `uid`, `note` FROM `note`

   sql_attr_uint = iid
   sql_attr_uint = uid
}

index tag1
{
        source                  = tag
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/tag1
        min_prefix_len          = 5
        min_word_len            = 1
        charset_table           = <?php print CHARSET_TABLE; ?>
        stored_fields           =
<?php print NGRAM_CONFIG; ?>
}

index note1
{
        source                  = note
        path                    = <?php print LOCALSTATEDIR; ?>/piler/manticore/note1
        min_prefix_len          = 5
        min_word_len            = 1
        charset_table           = <?php print CHARSET_TABLE; ?>
        stored_fields           =
<?php print NGRAM_CONFIG; ?>
}

<?php } else { ?>

index piler1
{
    type = rt
    path = /var/piler/manticore/piler1
    rt_mem_limit = 512M
    stored_fields =
    min_word_len = 1
    min_prefix_len = 5
    charset_table  = <?php print CHARSET_TABLE; ?>
    # See https://manual.manticoresearch.com/Creating_an_index/Data_types#Row-wise-and-columnar-attribute-storages
    # if you want to enable columnar storage
    # columnar_attrs = *
    rt_field = sender
    rt_field = rcpt
    rt_field = senderdomain
    rt_field = rcptdomain
    rt_field = subject
    rt_field = body
    rt_field = attachment_types
    rt_attr_bigint = arrived
    rt_attr_bigint = sent
    rt_attr_uint = size
    rt_attr_uint = direction
    rt_attr_uint = folder
    rt_attr_uint = attachments
}

index tag1
{
    type = rt
    path = /var/piler/manticore/tag1
    rt_mem_limit = 16M
    stored_fields = tag
    min_word_len = 2
    min_prefix_len = 5
    charset_table  = <?php print CHARSET_TABLE; ?>
    rt_field = tag
    rt_attr_bigint = mid
    rt_attr_uint = uid
}

index note1
{
    type = rt
    path = /var/piler/manticore/note1
    rt_mem_limit = 16M
    stored_fields = note
    min_word_len = 2
    min_prefix_len = 5
    charset_table  = <?php print CHARSET_TABLE; ?>
    rt_field = note
    rt_attr_bigint = mid
    rt_attr_uint = uid
}

index audit1
{
    type = rt
    path = /var/piler/manticore/audit1
    rt_mem_limit = 16M
    stored_fields = *
    min_word_len = 2
    min_prefix_len = 5
    charset_table  = <?php print CHARSET_TABLE; ?>
    rt_field = email
    rt_field = ipaddr
    rt_field = description
    rt_attr_bigint = ts
    rt_attr_bigint = meta_id
    rt_attr_uint = action
}

<?php } ?>

searchd
{
        listen                  = 127.0.0.1:9312
        listen                  = 127.0.0.1:9306:mysql
        listen                  = 127.0.0.1:9307:mysql_readonly
        log                     = /var/piler/manticore/manticore.log
        binlog_max_log_size     = 256M
        binlog_path             = /var/piler/manticore
        binlog_flush            = 2
        query_log               = /var/piler/manticore/query.log
        network_timeout         = 5
        pid_file                = /var/run/piler/searchd.pid
        seamless_rotate         = 1
        preopen_tables          = 1
        unlink_old              = 1
        thread_stack            = 512k
        # https://manticoresearch.com/blog/manticoresearch-buddy-intro/
        # Give a value to the buddy_path variable to enable manticore-buddy
        # Be sure that user piler has read/write access to it
        buddy_path              =
<?php if(RT) { ?>
        rt_flush_period         = 300
<?php } ?>

}

docker-compose.yml:

services:
  db:
    container_name: <redacted>-db
    environment:
      - MARIADB_AUTO_UPGRADE=true
      - MARIADB_DATABASE=piler
      - MARIADB_ROOT_PASSWORD=<redacted>
      - MARIADB_PASSWORD=<redacted>
      - MARIADB_USER=piler
      - TZ=Europe/Berlin
    image: mariadb:11.5
    restart: unless-stopped
    volumes:
      - ./config/mysql.cnf:/etc/mysql/conf.d/piler.cnf:ro
      - ./mysql:/var/lib/mysql

  manticore:
    container_name: <redacted>-manticore
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 512M
    expose:
      - "9306"
      - "9307"
      - "9312"
    image: manticoresearch/manticore:6.3.6
    restart: unless-stopped
    volumes:
      - ./config/manticore.conf:/etc/manticoresearch/manticore.conf
      - ./manticore:/var/lib/manticore

  memcached:
    command: -m 64
    container_name: <redacted>-memcached
    expose:
      - "11211"
    image: memcached:1.6-alpine
    restart: unless-stopped

  web:
    container_name: <redacted>-web
    depends_on:
      - db
      - manticore
      - memcached
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 512M
    environment:
      - MANTICORE_HOSTNAME=manticore
      - MEMCACHED_HOSTNAME=memcached
      - MYSQL_DATABASE=piler
      - MYSQL_HOSTNAME=db
      - MYSQL_PASSWORD=<redacted>
      - MYSQL_USER=piler
      - PILER_HOSTNAME=<redacted>
      - RT=0
    expose:
      - "25"
      - "80"
    healthcheck:
      interval: "20s"
      retries: 3
      start_period: "15s"
      test: curl -s smtp://localhost/
      timeout: "3s"
    image: sutoj/piler:1.4.6
    init: true
    restart: unless-stopped
    volumes:
      - ./config/piler:/etc/piler
      - ./data/import:/var/piler/import
      - ./data/store:/var/piler/store
      - ./logs/mail.log:/var/log/mail.log
      - ./logs/syslog:/var/log/syslog
jsuto commented 2 months ago

The problem is exactly what you wrote. The preferred setup is to use real-time (rt) index using docker. If you insist on the legacy plain index settings, then disable the indexer.* jobs in piler's crontab in the piler container, and run them in the manticore container.

If you decided to switch to RT index, then see https://manual.manticoresearch.com/Data_creation_and_modification/Adding_data_from_external_storages/Adding_data_to_tables/Attaching_one_table_to_another#Attaching-table---general-syntax how to make it happen.

sebastianhelbig commented 2 months ago

Thanks for your fast answer. Since now already quite some mails are missing in the index, may it be easier to switch to a new empty manticore instance with RT=1 and let the index be rebuild as described here: https://www.mailpiler.org/recreate-the-index-data-files/ ?

jsuto commented 2 months ago

Correct. Except you don't need to run the indexer scripts any longer.