noisebridge / infrastructure

The Noisebridge Infrastructure
GNU General Public License v3.0

Upgrade to Mailman 3 #68

Open SuperQ opened 6 years ago

SuperQ commented 6 years ago

It looks like mailman is being developed under a different package, mailman3, but it's not a simple upgrade.

http://docs.mailman3.org/en/latest/pre-installation-guide.html#how-can-i-upgrade-from-mailman-2-1-x

SuperQ commented 6 years ago

@patrickod What do you think about using a mailman 3 deployment to a new server as part of splitting up list service from the website?

marcidy commented 5 years ago

The upgrade does have a few potential issues; however, I do not believe we are sensitive to most of them. Here is a summary of the potential issues. I'll summarize the upgrade process in another comment.

Issues during migration

1) Links to archives will be broken.

1.a) Public archives: keep the HTML files generated for the archives and point users to that location. Archives are automatically migrated; however, the old links will break.

1.b) Links to private list archives cannot be kept. They are gated, and without mm2 there is no password.

I don't think this is necessary; we can just use the migrated archives. If it is necessary, we need to manually redirect users to the old, static links. Nothing can be done for the private archives.

2) No bounce processing. Do we use/need this?

3) mbox defects can cause weird issues with imported messages, specifically missing headers or an unparseable Date.
Headers may be corrected or ignored.

4) During the migration, if a line in a message body starts with "From", the migration treats it as the start of a new message.
A script exists, $prefix/bin/cleanarch, which resolves some but not all of these cases. When this issue occurs, message bodies will contain other messages. This typically occurs with older messages or spam.
If there are messages with "From" at the beginning of a line in the body, we are at risk of messages bleeding together.
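As a read-only pre-check for issue 4, something like the following Python sketch could list the lines that are at risk before anything is imported. It flags every line starting with "From " that is not preceded by a blank line or does not look like a normal mbox separator. The `SEPARATOR` pattern and the function names are assumptions made for illustration, not cleanarch's actual logic, and a body that quotes a complete message including a valid separator would not be caught.

```python
import re

# Rough shape of an mbox separator: "From addr  Tue Jan  1 00:00:00 2019".
# This pattern is an assumption for illustration, not cleanarch's regex.
SEPARATOR = re.compile(
    rb"^From \S+\s+(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    rb"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ \d]\d "
    rb"\d\d:\d\d:\d\d \d{4}"
)


def suspicious_from_lines(lines):
    """Yield (line_number, line) for 'From ' lines that do not look like separators."""
    previous_blank = True  # the first line of the file is allowed to be a separator
    for number, line in enumerate(lines, start=1):
        if line.startswith(b"From "):
            if not (previous_blank and SEPARATOR.match(line)):
                yield number, line.rstrip(b"\r\n")
        previous_blank = line.strip() == b""


def scan_file(path):
    """Print every suspicious line in the mbox file at path."""
    with open(path, "rb") as handle:
        for number, line in suspicious_from_lines(handle):
            print(number, line.decode("ascii", "replace"))
```

Calling `scan_file` on a copy of a list's LISTNAME.mbox would give a rough count of how many lines need attention; an empty result does not prove the file is clean.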

Production Issues

5) Plain-text digest mode doesn't handle multipart MIME messages or 'quoted-printable' and BASE64 encodings well. https://gitlab.com/mailman/mailman/issues/473

HyperKitty does not implement a scrubber for digest mode. A suggested workaround is to delete any non-text/plain parts and replace them with a note and a link to the archived message. For HTML-only messages this does not appear to work, and the HTML bleeds into the message.

This is unlikely to impact us as we are text only (I believe that is the case unless there is a plan to switch).

6) Incoming LMTP messages are dropped https://gitlab.com/mailman/mailman/issues/416

Do we use LMTP? No solution / work-around yet.

7) Custom headers and footers are dropped after migration. https://gitlab.com/mailman/mailman/issues/341

A manual step is required to restore the customizations. See the linked issue.

SuperQ commented 5 years ago

I don't think we care at all about links to the list archives. It's just a repo for crawlers to index.

Seems like most of the issues are not going to be a big problem for us.

marcidy commented 5 years ago

Adding migration plan from slack:

Option 1: deploy a clean new install on m6, make sure it all works, then do a cut-over and copy the data to the new box

Option 2: We can also easily do a test install on m6, wipe and re-install clean, with the data on m3 copied over.

marcidy commented 5 years ago

Just so I'm clear, option 2 is to do a clean install of mailman 2 on m6, copy the m3 data, then start testing the migration? I think I prefer this option; I would obviously like data to test the migration with. It seems like it should run smoothly, but we'll see. I'm going to dry-run a few times on my local install, but it has no data.

I think there are some privacy issues with the private lists. I assume you technically have access and therefore are trusted to look at them post-migration. I want to be sure they are handled correctly, given they have at least one difference mentioned above.

SuperQ commented 5 years ago

Option 2 is wipe the whole machine and install mailman 3.

Basically I think the only way to do a good clean upgrade is to start with mailman 3 and import the data from mailman 2 into it. This way we can repeat the process by wipe/retry more easily.

Basically my idea is this:

marcidy commented 5 years ago

Got it, ok.

Based on the migration instructions, it looks like a good idea to check mboxes for the following:

If the Mailman 2 list does not predate Mailman 2.1, its LISTNAME.mbox file is probably in good shape, but all mailboxes should be checked for defects before importing.

Certain defects such as missing Message-ID: headers or missing or unparseable Date: headers will be corrected or ignored by the import process.

The one defect that will definitely cause problems is lines beginning with From in message bodies. These will be seen as the start of a new message.

There is a Mailman 2 script at $prefix/bin/cleanarch. That can identify and fix most such lines, but it is not perfect. Cases have been observed where a post includes in its body a copy of some other message including the From separator. This will normally occur only on an old list which includes spam messages or other email problems in its subject matter, but is something to be aware of.

So a trial run of the script mentioned would be good. $prefix/bin/cleanarch
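Alongside a cleanarch trial run, the header defects quoted above (missing Message-ID, missing or unparseable Date) could be counted per mbox with the standard library. This is a minimal sketch: `header_defects` and `report` are hypothetical helper names, and what the Mailman 3 importer itself treats as unparseable may differ from what `email.utils.parsedate_to_datetime` rejects.

```python
import mailbox
from email.utils import parsedate_to_datetime


def header_defects(message):
    """Return a list of defect labels for one message."""
    defects = []
    if not message.get("Message-ID"):
        defects.append("missing Message-ID")
    date = message.get("Date")
    if not date:
        defects.append("missing Date")
    else:
        try:
            # Raises TypeError or ValueError (depending on the Python
            # version) when the value cannot be parsed as a date.
            parsedate_to_datetime(str(date))
        except (TypeError, ValueError):
            defects.append("unparseable Date")
    return defects


def report(path):
    """Print the defects found in each message of the mbox at path."""
    box = mailbox.mbox(path, create=False)
    for key, message in box.iteritems():
        found = header_defects(message)
        if found:
            print(key, message.get("Subject", ""), found)
```

Note that `mailbox.mbox` itself splits on any line starting with "From ", so on an uncleaned archive the message count from `report` may already be inflated by the separator problem described above.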

marcidy commented 5 years ago

What are the next steps for this specifically? Anything I can do?

marcidy commented 5 years ago

@SuperQ ping

marcidy commented 5 years ago

Can I have access to m3 to do the pls? I couldn't find a place in ansible that looked like it was for that.

kevr commented 4 years ago

Can I get a dump of the current mailman2 database, please?

I'd like to do this on my local machine across some chroots and VMs, no need for direct server access for now.