jsuto / piler

Email archiving application
https://www.mailpiler.org/

[BUG] Only last few emails are visible #179

Status: Closed (simonszu closed this 2 months ago)

simonszu commented 2 months ago

Describe the bug:

Over the past few days I installed piler and imported my email history into it. It was working fine, and I was able to browse my whole archive as an Auditor. As of this morning, piler believes that only around 18 mails (the newest ones) are present in my archive. I cannot search for older emails; the query simply returns 0 results. According to the statistics page, which is viewable as an Admin, the mails are still there. Since I did not modify anything overnight, I am quite surprised by this behavior and am struggling to make these mails visible again.

To Reproduce:

Steps to reproduce the behavior:

  1. Log in to piler as an Auditor who is assigned to these mails.
  2. Only the latest few emails are visible.
  3. Search for older emails that were visible yesterday.
  4. The search returns 0 results.
  5. Log in as Master Admin.
  6. The statistics page shows received messages: 72.000

Expected behavior:

The messages are visible, as they were yesterday.

Screenshots:

Master Admin view:

Screenshot 2024-09-04 at 09 44 08

Auditor view:

Screenshot 2024-09-04 at 09 45 07

This is an installation only for myself, so all emails are from my account and should be visible to me.

Piler version:

piler 1.4.6-80dadac, Janos SUTO <sj@acts.hu>

Build Date: Sun Jul 21 04:49:18 UTC 2024
ldd version: ldd (Ubuntu GLIBC 2.39-0ubuntu8.1) 2.39
gcc version: gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4)
OS: Linux 5386ff2d9f57 6.8.0-38-generic #38-Ubuntu SMP PREEMPT_DYNAMIC Fri Jun 7 15:25:01 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Configure command: ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var --with-database=mariadb
MySQL client library version: 3.3.10
Extractors: /usr/bin/pdftotext /usr/bin/catdoc /usr/bin/catppt /usr/bin/xls2csv /usr/bin/unrtf /usr/bin/tnef libzip

Additional context:

Installed via Docker.

Latest logs:

2024-09-04T07:28:00.266103+00:00 b992c39d3608 piler-smtp[81]: piler-smtp 1.4.6-80dadac starting
2024-09-04T07:28:00.275198+00:00 b992c39d3608 piler[83]: reloaded config: /etc/piler/piler.conf
2024-09-04T07:28:00.275958+00:00 b992c39d3608 piler[83]: piler 1.4.6-80dadac starting
2024-09-04T07:29:36.398758+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'INSERT INTO audit1 (ts, email, action, ipaddr, meta_id, description) VALUES(?,?,?,?,?,?)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.197304+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'INSERT INTO audit1 (ts, email, action, ipaddr, meta_id, description) VALUES(?,?,?,?,?,?)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.197927+00:00 b992c39d3608 piler-webui[33]: username=mail@simonszu.de, event='logged in', ipaddr=172.19.1.1
2024-09-04T07:29:45.315724+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'INSERT INTO audit1 (ts, email, action, ipaddr, meta_id, description) VALUES(?,?,?,?,?,?)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.316369+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'SELECT id FROM piler1 WHERE        MATCH('') ORDER BY `sent` DESC LIMIT 0,20 OPTION max_matches=1000' in 0.00 s, 18 hits, 18 total found
2024-09-04T07:29:45.317311+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'SELECT mid, tag FROM tag1 WHERE uid=3 AND mid IN (76620,76619,76618,76617,76616,76614,76615,76613,76612,76611,76610,76609,76608,76607,76606,76605,76604,76603)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.317694+00:00 b992c39d3608 piler-webui[33]: sphinx query: 'SELECT mid, note FROM note1 WHERE uid=3 AND mid IN (76620,76619,76618,76617,76616,76614,76615,76613,76612,76611,76610,76609,76608,76607,76606,76605,76604,76603)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.323164+00:00 b992c39d3608 piler-webui[32]: sphinx query: 'INSERT INTO audit1 (ts, email, action, ipaddr, meta_id, description) VALUES(?,?,?,?,?,?)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.323866+00:00 b992c39d3608 piler-webui[32]: sphinx query: 'SELECT id FROM piler1 WHERE        MATCH('') ORDER BY `sent` DESC LIMIT 0,20 OPTION max_matches=1000' in 0.00 s, 18 hits, 18 total found
2024-09-04T07:29:45.324760+00:00 b992c39d3608 piler-webui[32]: sphinx query: 'SELECT mid, tag FROM tag1 WHERE uid=3 AND mid IN (76620,76619,76618,76617,76616,76614,76615,76613,76612,76611,76610,76609,76608,76607,76606,76605,76604,76603)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:29:45.325089+00:00 b992c39d3608 piler-webui[32]: sphinx query: 'SELECT mid, note FROM note1 WHERE uid=3 AND mid IN (76620,76619,76618,76617,76616,76614,76615,76613,76612,76611,76610,76609,76608,76607,76606,76605,76604,76603)' in 0.00 s, 0 hits, 0 total found
2024-09-04T07:31:00.904527+00:00 b992c39d3608 piler-smtp[82]: connected from 172.19.1.1:42022 on fd=6 (active connections: 1)
2024-09-04T07:31:01.097576+00:00 b992c39d3608 piler-smtp[82]: received: Z62YVSYOAUU8RBXY, from=caf_@simonszu=mailbox.org, size=45402, client=172.19.1.1, fd=6, fsync=2108
2024-09-04T07:31:01.111110+00:00 b992c39d3608 piler-smtp[82]: disconnected from 172.19.1.1 on fd=6, slot=0, reason=done (0 active connections)

This line is especially interesting:

2024-09-04T07:29:45.323866+00:00 b992c39d3608 piler-webui[32]: sphinx query: 'SELECT id FROM piler1 WHERE        MATCH('') ORDER BY `sent` DESC LIMIT 0,20 OPTION max_matches=1000' in 0.00 s, 18 hits, 18 total found

Why does manticore only return 18 hits?

I am aware of the FAQ "I can't see any results, however the sphinx query in the maillog reports 0 hits, and >0 total found", which recommends using "or even manticore search 6.x". I have manticore running via Docker as well, and I am using version 6.3.2.

I am also aware of the FAQ "I can see only today's emails in the archive and not any single previous emails.", which describes exactly my problem. However, the START="no" setting is the default in manticore's Docker image. I am currently trying to reindex manticore with reindex -a in the piler image, which apparently does something. However, I still want to prevent this problem from happening in the future. I am not sure how the FAQ applies to this problem when everything runs as a container.

jsuto commented 2 months ago

Check the piler1 index:

mysql -h0 -P9306
show index piler1 status;

simonszu commented 2 months ago

Yeah, it looks like manticore lost its index. When I trigger reindex -a from the piler container, the index grows and more and more mails become visible in piler's WebUI. However, is there a way to keep manticore from losing its index? The START=no setting is set in /etc/default/manticore in the manticore container. Is there anything else I could try? In the worst case, a cronjob in the piler container which executes reindex -a periodically?
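
A periodic check along these lines could be sketched as follows. This is only an illustration, not something piler ships: it assumes the status output was captured with something like `mysql -h0 -P9306 -N -B -e "show index piler1 status"`, that the output contains a row named `indexed_documents`, and that `expected_minimum` is a threshold you choose yourself (for example, a bit below the message count on the statistics page).

```python
def indexed_documents(status_output):
    """Return the indexed_documents value from 'show index ... status' output.

    Returns None if no such row is present. The row name is an assumption
    about the output format; adjust it if your version reports it differently.
    """
    for line in status_output.splitlines():
        # Drop table borders in case the output came from an interactive session.
        fields = line.replace("|", " ").split()
        if len(fields) >= 2 and fields[0] == "indexed_documents":
            return int(fields[1])
    return None


def index_looks_truncated(status_output, expected_minimum):
    """True if the index reports fewer documents than expected_minimum.

    expected_minimum is a hypothetical, user-chosen threshold.
    A missing indexed_documents row is also treated as suspicious.
    """
    count = indexed_documents(status_output)
    return count is None or count < expected_minimum
```

A cron job could run the check first and only call reindex -a when `index_looks_truncated` returns True, rather than reindexing unconditionally.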

jsuto commented 2 months ago

Yes. Put the manticore data dir on a persistent volume.

simonszu commented 2 months ago

Actually, I did. Here is an excerpt from the Ansible playbook I use to deploy piler:

- name: Start container for manticore
  docker_container:
    name: manticore
    image: manticoresearch/manticore:6.3.2
    restart_policy: always
    volumes:
      - "{{ docker_datadir }}/piler/manticore/data:/var/lib/manticore"
      - "{{ docker_datadir }}/piler/manticore/config/manticore.conf:/etc/manticoresearch/manticore.conf"
    labels:
      com.centurylinklabs.watchtower.scope: "regular"
    networks:
      - name: backend
  become: yes

Even if I had not put the manticore data dir on a persistent volume, the data should have stayed in place until the manticore container was recreated, which it wasn't.

So this is more likely an issue to open against the manticore project, you say?

jsuto commented 2 months ago

Not sure. Before doing so, I'd verify that the index configured in manticore.conf is actually on the data volume. Anyway, I don't think it's a piler issue.
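
One way to run that verification is sketched below. It is a rough illustration under stated assumptions: it only looks at `path` and `data_dir` lines in manticore.conf, and the mount point `/var/lib/manticore` is taken from the playbook excerpt above. The function name and the sample config in the usage note are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Matches uncommented lines such as "path = /some/dir/piler1"
# or "data_dir = /some/dir" in manticore.conf.
_PATH_RE = re.compile(r"^\s*(path|data_dir)\s*=\s*(\S+)", re.MULTILINE)


def paths_outside_volume(conf_text, mount_point):
    """List (key, value) pairs whose path is not under mount_point.

    Relative paths are reported as outside, because they cannot be
    resolved without knowing the daemon's working directory.
    """
    mount = PurePosixPath(mount_point)
    outside = []
    for key, value in _PATH_RE.findall(conf_text):
        candidate = PurePosixPath(value)
        if candidate != mount and mount not in candidate.parents:
            outside.append((key, value))
    return outside
```

Calling `paths_outside_volume(open("manticore.conf").read(), "/var/lib/manticore")` returns an empty list when every configured path is under the mounted directory. Any entry it does return points at index files that are stored in the container's own filesystem rather than on the volume.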