cyrusimap / cyrus-imapd

Cyrus IMAP is an email, contacts and calendar server
http://cyrusimap.org
Other
531 stars 146 forks source link

squatter and cyrus.conf: START | EVENTS #2234

Open dilyanpalauzov opened 6 years ago

dilyanpalauzov commented 6 years ago

https://cyrusimap.org/imap/developer/install-xapian.html suggests putting squatter in cyrus.conf:START.

https://cyrusimap.org/imap/reference/manpages/systemcommands/squatter.html suggests putting squatter in cyrus.conf:EVENTS.

The latter should state when to put squatter in START and when in EVENTS.

https://cyrusimap.org/imap/developer/install-xapian.html should state that convesations must be enabled in order to use search_xapian. In particular, running "squatter -R" doesn't make problems, but just running 'squatter' (to index the past) emits: "ERROR: conversations required but not enabled".

elliefm commented 6 years ago

Oh, well spotted, thanks :)

My understanding here is that you put squatter -R in START when you want new messages to be indexed (almost) immediately; or you put a non--R invocation into EVENTS when you want indexing to occur on some schedule.

But... https://cyrusimap.org/imap/reference/manpages/systemcommands/squatter.html also says,

Note

Incremental updates are very inefficient with the SQUAT search engine. If using SQUAT for large and active mailboxes, you should run squatter periodically as an EVENT in cyrus.conf(5).

@rsto Does this inefficiency affect the Xapian search engine, or only the Squat engine?

And https://cyrusimap.org/imap/developer/install-xapian.html says,

Xapian requires a running squatter(8) instance:

Does this simply mean that you need to arrange for squatter to run (either by START or EVENTS)? Or does it mean more specifically that Xapian indexes can only be built in -R mode?

elliefm commented 6 years ago

(I'll update the docs once I'm clear on my understanding.)

elliefm commented 6 years ago

If you're using multiple search tiers, you'll also need to periodically run squatter with the -t and -z arguments to compact them, so even if you were primarily indexing in rolling mode, you would also want some EVENT entries for this. I think multiple search tiers are only supported for the Xapian engine.

I've just re-read the old email that the install-xapian doc references, and it says:

We built a C API wrapper around the Xapian features we needed, and tried to run in production. Unfortunately, the IO hit from indexing new messages in real time was too much - our server couldn't cope. [edit: it goes on to describe setting up tiers so that real time indexing occur on tmpfs]

So it sounds like the rolling mode performance issue is a problem with the Xapian engine too, not just Squat. So I guess our recommendation would be something like:

@postilion this looks like another area that needs the de-defaultifying treatment. Did you already have Xapian doc updates in the works? I remember us talking about search tiers recently. Let's coordinate on this so we don't double up or cross wires!

postilion commented 6 years ago

@elliefm I'll try to poke at this tomorrow (my time) to see where it's at. The documentation on Xapian is woeful, but I am running our (small) server with it enabled in rolling mode. I am not using multiple tiers, however, as it was unclear to me just what the advantages would be, vis-a-vis the complexity trade-off.

elliefm commented 6 years ago

Great, thanks :)

postilion commented 6 years ago

@elliefm Some of the vagueness in language in the man page is due to its history. Originally there were just squat indexes. Then Sphinx was added and the page was rewritten to support a universe of two index styles. Then Xapian was added, and Sphinx was deprecated, but the tech writers (myself included) didn't appreciate that Sphinx was totally removed. In fact, that Sphinx had never got further than an in-house Fastmail effort.

At the time I undertook to re-write the manpage, in February/March of 2017, a discussion with Bron led to my excising all remaining references to Sphinx, and adding the notes about where/how to run squatter, depending upon which search engine is used.

I think it will be fairly straight forward to clean up squatter(8) to make more sense, but this leaves a larger task, which is to properly document:

Unfortunately, I have a very small installation here, and it doesn't really tickle any performance issues. My clients have significantly larger installations, but I don't think I can risk wading too far in to search engine testing with them until I understand all of the ramifications of it. And, all my client installations are still on 2.5.X at this time.

So, I will touch up the squatter(8) page, but leave the larger project wanting for now. -nic

PS -- One for the disambiguation department is the unfortunate fact that the now-generic indexing tool is named after the old indexing engine. This leads to even more confusion.

rsto commented 6 years ago

if you want immediate index updates: use squatter -R in START, but make sure your defaultsearchtier is on fast storage, and then use compact jobs running from EVENTS (or cron, etc) to periodically move indexes to larger/slower storage if you don't need immediate index updates: just run squatter from EVENTS

this summary by @elliefm sums it pretty well for Xapian

dilyanpalauzov commented 6 years ago

Does squatter+xapian require conversations? Running "squatter" on the command line says so, but running "squatter -R" using cyrus.conf doesn't make any problems on 3.0.

Is for xapian required to specify this explicitly, if the values are implicit:

defaultpartition: default partition-default: /var/cyrus/spool

In install-xapian.rst >after "Add this to imapd.conf5)" insert

sync_log: on sync_log_channels: squatter

to clarify, that this section is for the last two upper bullets (which I have overseen, thanks to the snippet).

rsto commented 6 years ago

Does squatter+xapian require conversations?

Yes, this is checked here: https://github.com/cyrusimap/cyrus-imapd/blob/cyrus-imapd-3.0/imap/search_xapian.c#L100

If your invocation of running squatter didn't end up here, there's something bogus.

dilyanpalauzov commented 6 years ago

Without "conversations: 1":

cyrus@host$ squatter 
cyrus-local/squatter[12636]: indexing mailboxes
cyrus-local/squatter[12636]: ERROR: conversations required but not enabled
cyrus-local/squatter[12636]: done indexing mailboxes

but

cyrus@host:/git/cyrus-imapd$ squatter -R -d
^Ccyrus-local/squatter[12735]: graceful shutdown

works until Ctrl-C is pressed. The latter enters neither check_config() nor run().

I am talking about 3.0.

elliefm commented 6 years ago
cyrus@host:/git/cyrus-imapd$ squatter -R -d
^Ccyrus-local/squatter[12735]: graceful shutdown

works until Ctrl-C is pressed. The latter enters neither check_config() nor run().

What happens if a message is delivered while that squatter -R -d is running? I'm wondering if maybe it doesn't notice the missing conversations until it gets activity and tries to do something. I haven't checked the code.

dilyanpalauzov commented 6 years ago

Ah, squatter -R crashes on my system with SIGSERV even with sync_log: on with this backtrace full:

#0  0x00007f160ebdf357 in search_update_mailbox (rx=0x0, mailbox=0x7f16085e6018, flags=flags@entry=5)
    at imap/search_engines.c:211
        r = 0
        r2 = <optimized out>
        was_partial = 0
        batch_size = <optimized out>
        batch = {
          count = 0, 
          alloc = 0, 
          data = 0x0
        }
        msg = <optimized out>
        iter = 0x769f30
#1  0x0000000000403b92 in index_one (name=name@entry=0x769f30 "user.X.Y.Z", 
    blocking=blocking@entry=0) at imap/squatter.c:292
        mbentry = 0x0
        mailbox = 0x7f16085e6018
        r = 0
        flags = 5
        extname = 0x769df0 "user/X/Y/Z"
#2  0x0000000000403341 in do_rolling (channel=0x4044d8 "squatter") at imap/squatter.c:735
        mboxname = 0x769f30 "user.X.Y.Z"
        folders = 0x769eb0
        slr = 0x769b80
        i = 0
        r = <optimized out>
        folders = <optimized out>
        slr = <optimized out>
        i = <optimized out>
        r = <optimized out>
        mboxname = <optimized out>
#3  main (argc=<optimized out>, argv=<optimized out>) at imap/squatter.c:1010
        opt = <optimized out>
        alt_config = 0x0
        extname = 0x769df0 "user/X/Y/Z"
#2  0x0000000000403341 in do_rolling (channel=0x4044d8 "squatter") at imap/squatter.c:735
        mboxname = 0x769f30 "user.X.Y.Z"
        folders = 0x769eb0
        slr = 0x769b80
        i = 0
        r = <optimized out>
        folders = <optimized out>
        slr = <optimized out>
        i = <optimized out>
        r = <optimized out>
        mboxname = <optimized out>
#3  main (argc=<optimized out>, argv=<optimized out>) at imap/squatter.c:1010
        opt = <optimized out>
        alt_config = 0x0
        r = <optimized out>
        mboxnames = {
          count = 0, 
          alloc = 0, 
          data = 0x0
        }
        query = 0x0
        background = 1
        channel = 0x4044d8 "squatter"
        synclogfile = 0x0
        init_flags = <optimized out>
        multi_folder = 0
        user_mode = 0
        compact_flags = <optimized out>
        fromfile = 0x0
        srctiers = 0x0
        desttier = 0x0
        mode = <optimized out>
elliefm commented 6 years ago

Thanks for the backtrace. I've found the issue, and am working on a patch

elliefm commented 6 years ago

The crash is fixed by 9ddabb5c1 (3.0) and 3bf7fc8d4 (master), but rolling mode still won't detect the misconfiguration until some sync input arrives. Working on that now

dilyanpalauzov commented 6 years ago

Without sync_log:on squatter -d -R does not print anything. With sync_log it prints

indexing mailbox user/...
Xapian commited 2 updates i 10 sec

Apart from this, terminating the master process does not terminate squatter, if it was started in START {}.

postilion commented 6 years ago

@dilyanpalauzov Without commenting on your first point, I can confirm that you're correct in your second point. The newer versions of Cyrus have added a new section to cyrus.conf(5), "DAEMON". As per the manpage for cyrus.conf(5):

START

This section lists the processes to run before any SERVICES are spawned. This section is typically used to initialize databases. Master itself will not startup until all tasks in START have either completed or backgrounded themselves, so put no blocking commands here.

DAEMON

This section lists long running daemons to start before any SERVICES are spawned. They will be shutdown when :cyrusman:master(8) is exiting.

So, the documentation for squatter(8) should be updated to reflect this. I'll try to take care of that shortly.

Thanks! -nic

postilion commented 6 years ago

@dilyanpalauzov In regards to the first point in your last comment, the squatter(8) manpage does explicitly state:

When using the -R rolling mode, you MUST enable sync_log operation in :cyrusman:imapd.conf(5) via the sync_log: on setting, and MUST define a sync_log channel via the sync_log_channels: setting.

So the documentation can't be clearer. However, the program should honour this requirement and exit with an error if sync_log is not set, or there is no sync_log_channel: defined. @elliefm Are you taking care of that?

@elliefm Also, as shown above, the squatter(8) manpage refers to the sync_log_channels setting, but the imapd.conf(5) manpage (derived from /lib/imapoptions) has only this to say about sync_log_channels:

{ "sync_log_channels", NULL, STRING } /* If specified, log all events to multiple log files in directories specified by each "channel". To run these log files, you need to pass the -n option to sync_client -r with the channel name. Use this for a mesh style replication layout - every machine replicating to every other machine. You can use "" (the two-character string U+22 U+22) to mean the default sync channel. */

I think this needs updating to reflect the more diverse uses to which sync_log_channels is now being put. I've traditionally stayed away from editing things outside of /docsrc, however, so leave this to others.

Thanks again! -nic

elliefm commented 6 years ago

For whatever it's worth, I'm not sure how "new" squatter's use of sync channels is, at least without diving into the history. It predates my involvement with Cyrus, anyway. But I'm agreed about updating the sync_log_channels description in the imapd.conf man page. I'll see if I can put something together today. My inclination is to just explain what the setting does, and let the various other sets of documentation (replication docs, backup docs, search indexing docs) provide specific motivations/examples.

The behaviour we want, whereby squatter -R exits immediately with an error if it detects bad config, is fairly easy to implement, but I'm not certain there's no other consequences. At the moment, squatter backgrounds itself before reading cyrus config. In order to exit cleanly with an error, we would need to do so before backgrounding, but we also need to have the cyrus config loaded. I think I can just make it background itself a little later, but I'm not sure what side effects there might be. I'll make the change but put it up as a pull request so it can get some eyes on it. It might not get backported to 3.0, I'm not sure yet.

postilion commented 6 years ago

@elliefm Thanks for your comments & consideration. Squatter's use of sync_log is new in 3.X. There was no need for it prior to rolling replication, which came with Xapain. However, this may have been in a Fastmail branch for some time before entering main line.

Cheers, -nic

elliefm commented 6 years ago

Oh really? I was sure I'd seen it as far back as 2.5, but memory can be deceiving! (For whatever it's worth, I don't really have much involvement with FastMail's internal branches. I can see them if I want to, but I mostly only look at or work on mainline.)

postilion commented 6 years ago

@elliefm Seems like it's been around forever, huh? :-) I just checked, added in the 3.0 series. squatter.c didn't have do_rolling() until 3.0.0-beta. There was no Xapian support until 3.0 (albeit there was Sphinx) so no purpose for rolling, since SQUAT is rubbish at it.

Cheers, -nic

elliefm commented 6 years ago

Yeah, I just checked the same thing. The commits are from 2012 but must've been fm-only until after 2.5 became its own branch (which was bang on the time I started)

Curiously, the 2.5 docs on the website mention the -R option (which was the first thing I checked), but the 2.5 docs are their own special kind of mess so khlkahdf

elliefm commented 6 years ago

@dilyanpalauzov I think this pair of commits should fix up squatter so that it complains about missing conversations immediately: https://github.com/cyrusimap/cyrus-imapd/compare/cyrus-imapd-3.0...elliefm:v30/2234-squatter-dont-roll-if-bad-config

I don't have an environment where I can test this properly at the moment, so I'd appreciate any feedback.

Still need to look into having it complain about missing sync log settings

dilyanpalauzov commented 6 years ago

Under the DAEMON section of cyrus.conf squatter has to be run with -R -d, otherwise it forks in the background, the master process gets SIGCHLD, decreases ready_workers of the service, and starts it again, effectively achieving a fork bomb! In fact DAEMON processes under cyrus/master may not fork and if a DAEMON-process terminates cyrus/master restarts it.

imap/developer/install-xapian.rst needs also an update in regards of START -> DAEMON and in

Can you write some documentation for calalarmd? Currently it is useless, as nobody knows how to use it.

With the proposed patch for 3.0, after disabling conversations in imapd.conf and running squatter -R -d it exits with

cyrus-local/squatter[7497]: ERROR: conversations required but not enabled
fatal error: xapian: conversations required but not enabled

Is this the definite procedure to enable Xapian on a running system, in this order:

postilion commented 6 years ago

@dilyanpalauzov Please start a separate issue for calalarmd. It has nothing to do with this thread.

As for the "definite procedure" you've laid out, that looks pretty accurate to me. I haven't had a chance to play with compacting yet, and have no particular insight in to how this is supposed to work, what steps are required to deploy it, etc.

All I have to go on is the same documents you have access to.

Unfortunately, while Xapian is a potentially very significant add-on to Cyrus, there's not a lot of background information available on it. Fastmail use it, and Bron has laid out, in the emails cited in the documentation, how they do so, but there's no more generalized information. I have got Xapian up an running on our server, but it's very small, so not a great example. For example, we are not using compaction at all, but I have no idea whether that's a problem, and no way to judge the matter.

I do hope to learn more about it, but until I've had a chance to do more work with Xapian, I cannot very well write more documentation for it.

If you develop any further information or insight on it, please feel free to share.

Meanwhile, it sounds like the patch solves the problems this issue first raised, as long as we add a note that "-d" must be used when starting squatter in DAEMON section. I will add that to squatter(8) manpage when I get into my office.

Cheers, -nic

elliefm commented 6 years ago

Under the DAEMON section of cyrus.conf squatter has to be run with -R -d, otherwise it forks in the background, the master process gets SIGCHLD, decreases ready_workers of the service, and starts it again, effectively achieving a fork bomb! In fact DAEMON processes under cyrus/master may not fork and if a DAEMON-process terminates cyrus/master restarts it.

Oop, that's a bug. idled has a way to cope with this, I'll figure it out and make sure squatter handles it too. Thanks for testing!

elliefm commented 6 years ago

Okay, there's a new commit on that branch that should make squatter -R work correctly from the DAEMON section -- can you please confirm? https://github.com/cyrusimap/cyrus-imapd/compare/cyrus-imapd-3.0...elliefm:v30/2234-squatter-dont-roll-if-bad-config

Thanks

elliefm commented 6 years ago

My understanding of compaction is that it moves indexes from the source tier to the destination tier. If you only have a single tier, then you do not need to schedule a compact job, because nothing needs to be moved around.

The reason you may want multiple tiers is to segregate storage I/O load. Real-time/rolling indexing performs many disk writes, so you want this tier to be on storage that's very fast to write to. However, because messages are immutable, once they're indexed, that index won't change. So for long term storage of indexes, you only need good read performance, and even then only when a search is underway. So you set up two (or more) tiers, each with suitable underlying storage, and then you use squatter in compact mode to move the indexes from the real time tier down to the archival tier.

But, I do not know how the -F option fits in if you only have one tier.

dilyanpalauzov commented 6 years ago

With the proposed patch the fork bomb is gone.

For calarmd there is #2127.

squatter -h doesn't mention -Z and description for -U differs from squatter.rst.

In docsrc/imap/developer/install-xapian.rst and doc/README.xapian: START needs to be replaced with DAEMON.

In imap/reference/manpages/configs/cyrus.conf.rst: mention for DAEMON, that master ensures that the process is running and if the process dies or forks, master starts a new process;

Concerning docsrc/imap/reference/manpages/systemcommands/squatter.rst:

Must defaultpartition be defined for Xapian search to work, if there is only one partition?

Do I understand correctly, that each message in the xapian index can be stored in compacted or an uncompacted form?

The comments above say:

And then:

About rolling mode and immediate index updates: if all storages are equally fast, does squatter need to be run from EVENTS to compact?

Is there a cyrus.conf showing how to call "squatter -R" as daemon and simultaneously as EVENTS/compacting service?

Does squatter have to be run from EVENTS to filter expunged reports?

What does reindexing mean?

Is running 'squatter' several times in a row idempotent, even on a running system where squatter -R is running?

If there is only one tear, that is also rolling, are the reads blocked during the write operation?

How can I test that the xapian index is used when I search?

I guess in imap/squatter.c: the comment never returns and break; can be removed, if do_rolling is annotated with __attribute__((noreturn)).

     case ROLLING:
-        if (background)
+        if (background && !getenv("CYRUS_ISDAEMON"))
             become_daemon();
         do_rolling(channel);
         /* never returns */
         break;
hagedose commented 6 years ago

Sorry I'm late, but I wanted to make the observation that some of the discussion here does not reflect what I'm seeing. I'm running 3.0.5 and I put "squatter -R" in the START section of cyrus.conf. Contrary to @dilyanpalauzov's experience it is terminated when the Cyrus service is shut down:

$ ps auxww|grep squat|grep -v grep
cyrus    51148  4.1  2.6 154580 49404 ?        S    16:36   2:09 squatter -R
$ systemctl stop cyrus-imapd
$ ps auxww|grep squat|grep -v grep
$

The same when I just kill master:

$ ps auxww|grep squat|grep -v grep
cyrus    57759  0.0  0.1 108440  2148 ?        S    17:38   0:00 squatter -R
$ ps auxww|grep master|grep -v grep
cyrus    57754  0.0  0.1  80596  2844 ?        Ss   17:38   0:00 /usr/lib/cyrus-imapd/cyrus-master
$ kill 57754
$ ps auxww|grep master|grep -v grep
$ ps auxww|grep squat|grep -v grep
$

I wonder what the difference in our setups might be that would explain that.

elliefm commented 6 years ago

My recollection is that master sends a signal to its process group at shutdown. If the squatter process is in the same process group, it will receive this signal too. I think the difference comes down to how master is started (as root? as cyrus user? from an interactive session? rc script? etc) but I don't recall the details. There's discussion kicking around about idled often presenting a similar behaviour, which I think is what the DAEMON section of cyrus.conf might originally have been conceived to solve.

postilion commented 6 years ago

@dilyanpalauzov shortly I am addressing many of your concerns from the comment from 9 February 2017. You've raised many good points, and it will take us some time to respond to everything. Meanwhile, I want to get the most critical items taken care of.

Thanks for your thorough reporting. :-)

Cheers, -nic

postilion commented 6 years ago

In regards to "squatter.rst shall refer to install-xapian.rst" I think not, as references within man pages should be confined to other man pages, so the user can respond to them easily. I will add a reference in an HTML-only comment. -nic

postilion commented 6 years ago

@dilyanpalauzov : In regards "Must defaultpartition be defined for Xapian search to work, if there is only one partition?" Yes, by definition, defaultpartition is always defined, and it defaults (unfortunately) to "default." If this is not what you meant to ask, please rephrase.

Thanks again. -nic

elliefm commented 6 years ago

The DAEMON mode fork bomb issue is now fixed on both the 3.0 and master branches.

The master branch additionally has the changes such that squatter will exit immediately if it detects bad config, but I'm not comfortable changing the startup behaviour on the stable 3.0 branch.

dilyanpalauzov commented 6 years ago

When the master process is started as daemon it calls setsid(), creating a session that terminates all processes, started by master upon master termination. When master -D is called, there is no session for the START, so that on termination of master START-processes are not terminated. Apparently while testing I started master -D and found out, that squatter run under START was not terminated. That said cyrus.conf.rst shall state that sometimes START terminates the processes started by master and sometimes not.

What speaks against calling always setsid() and having a separate session, except that when master is short-lived it can terminate other processes, that may have not finished.

A previous comment says that defaultpartition: always defaults to default, but according to imapd.conf defaultpartition does not default to "default" but to <none>. Anyway, I think install-xapian shall be rewriten not to use defaultpartition, if this option is not necessary for running Xapian.

"For all modes, the -S option may be specified" means that -S can always be used, but the first synopsis (-v -Z) doesn't match on this.

postilion commented 6 years ago

@dilyanpalauzov in re: defaultpartition; You are correct!! My apologies. What I describe was the default behaviour up until 2009, but was then changed to the current form. I've been using Cyrus since 1997 and never noticed the change. This is quite embarrassing. :-0

But, regardless of that, the reason defaultpartition is included in the install-xapian documentation is to show how the various partition-related settings interact with each other. And also because up until a year ago, it was mandatory to set both defaultpartion and partition-name (where is the string to which defaultpartition is set) when using Xapian.

While it is no longer mandatory to set both defaultpartiton and partition-<default> for Xapian, it's still a useful illustration of how the various settings interact. The various pages of documentation strive to provide useful examples of the various settings, and where those settings interact, we strive to make that clear.

Since many people will be starting with whatever settings their distributions provide for the basic imapd.conf(5) and cyrus.conf(5) files, and those files still tend to use defaultpartition: default and partition-default: /var/spool/imap (or similar) that's what we often use in our examples.

I will note that as part of the effort to disambiguate the use of the word "default" as both a directive name (i.e. defaultpartition) and a setting's value, in PR #2224 I started to clean this sort of thing up. If you are curious, the documentation in imap/admin/locations/mailspool.rst sets out an example, and then other pages expand upon that, such as imap/admin/locations/archive-partitions.rst and imap/admin/locations/searchtiers.rst

In re: your comment about the missing -S in the first synopsis of squatter.rst; thanks for the observation. This will be fixed in the next commit. More of your notes, above, will also be addressed in future.

Thanks, again, for your notes. -nic