imapsync / imapsync

Imapsync is an IMAP transfer tool. Its purpose is to migrate or back up IMAP accounts. IMAP is one of the three current standard protocols for accessing mailboxes; the other two are POP3 and HTTP with webmail, and webmail is often tied to an IMAP server. The upstream website is
http://imapsync.lamiral.info

UTF7-IMAP problem? #478

Closed rmff closed 1 month ago

rmff commented 1 month ago

Hi,

I'm having trouble with folder names when migrating several accounts from an 'unknown' IMAP server to Carbonio v24.9 (Zimbra "fork"). Folders with diacritics are not correctly named at the destination.

For example:

Host1 folder   48/67 [INBOX.Projetos._Arquivo.Cart&APM-rio] = [INBOX.Projetos._Arquivo.Cartório] ...
Host2 folder   48/67 [Projetos/_Arquivo/Cart&APM-rio] = [Projetos/_Arquivo/Cartório] ...

This folder is created as "Cart&APM-rio" at the destination instead of the correct "Cartório".
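For readers unfamiliar with the `&APM-` notation: it is IMAP modified UTF-7 (RFC 3501, section 5.1.3), where `&` opens a base64-encoded UTF-16BE run and `-` closes it. The following is a minimal Python sketch of a decoder, not imapsync's own code; the function name `decode_imap_utf7` is made up for illustration.

```python
import base64

def decode_imap_utf7(name: str) -> str:
    """Decode an IMAP modified UTF-7 mailbox name (RFC 3501, section 5.1.3)."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            out.append(ch)            # plain ASCII is copied as-is
            i += 1
            continue
        end = name.index("-", i)      # every shifted run is closed by "-"
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")           # "&-" stands for a literal ampersand
        else:
            # Modified base64 uses "," instead of "/" and omits "=" padding.
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)

print(decode_imap_utf7("INBOX.Projetos._Arquivo.Cart&APM-rio"))
# INBOX.Projetos._Arquivo.Cartório
```

A client that displays "Cart&APM-rio" literally is showing the encoded name without decoding it.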

I tried using a regex, without success, like this: --regextrans2 "s,([áàâäãåæ])|([ÁÀÂÄÃÅÆ])|([ç])|([Ç])|([éèêë])|([ÉÈÊË])|([íìîï])|([ÍÌÎÏ])|([ñ])|([Ñ])|([óòôöõøœ])|([ÓÒÔÖÕØŒ])|([ß])|([úùûü])|([ÚÙÛÜ]),${1:+a:}${2:+A:}${3:+c:}${4:+C:}${5:+e:}${6:+E:}${7:+i:}${8:+I:}${9:+n:}${10:+N:}${11:+o:}${12:+O:}${13:+s:}${14:+u:}${15:+U:},g"

Is there a way to fix this using imapsync options?

My command: ./sync_loop_unix.sh --automap --compress1 --noreleasecheck --emailreport1 --emailreport2 --usecache --buffersize 8192000 --syncinternaldates --f1f2 'INBOX.N&AOM-o &AOk- spam'=INBOX --f1f2 'INBOX.&AMk- spam'=Junk --f1f2 'INBOX.!Lixo Eletr&APQ-nico'=Junk --f1f2 'INBOX.Itens Enviados'=Sent --f1f2 'INBOX.Lixeira'=Trash --f1f2 'INBOX.Rascunhos'=Drafts --regextrans2 "s,([áàâäãåæ])|([ÁÀÂÄÃÅÆ])|([ç])|([Ç])|([éèêë])|([ÉÈÊË])|([íìîï])|([ÍÌÎÏ])|([ñ])|([Ñ])|([óòôöõøœ])|([ÓÒÔÖÕØŒ])|([ß])|([úùûü])|([ÚÙÛÜ]),${1:+a:}${2:+A:}${3:+c:}${4:+C:}${5:+e:}${6:+E:}${7:+i:}${8:+I:}${9:+n:}${10:+N:}${11:+o:}${12:+O:}${13:+s:}${14:+u:}${15:+U:},g" --justfolders --dry

rmff commented 1 month ago

Going further, if I use --regextrans2 's/&APM-/ó/g', it works. However, this seems odd because IMAPSync should already be using UTF-7-IMAP for folder names, and in my understanding, this type of conversion should be redundant.

rmff commented 1 month ago

A very ugly solution, but it worked.

Replace each accented character with its unaccented letter:
[áàâäãåæ] => a
[ÁÀÂÄÃÅÆ] => A
[ç] => c
[Ç] => C
[éèêë] => e
[ÉÈÊË] => E
[íìîï] => i
[ÍÌÎÏ] => I
[ñ] => n
[Ñ] => N
[óòôöõøœ] => o
[ÓÒÔÖÕØŒ] => O
[ß] => s
[úùûü] => u
[ÚÙÛÜ] => U

--regextrans2 's/&AOE-|&AOA-|&AOI-|&AOQ-|&AOM-|&AOU-|&AOY-/a/g' 
--regextrans2 's/&AME-|&AMA-|&AMI-|&AMQ-|&AMM-|&AMU-|&AMY-/A/g' 
--regextrans2 's/&AOc-/c/g' 
--regextrans2 's/&AMc-/C/g' 
--regextrans2 's/&AOk-|&AOg-|&AOo-|&AOs-/e/g' 
--regextrans2 's/&AMk-|&AMg-|&AMo-|&AMs-/E/g' 
--regextrans2 's/&AO0-|&AOw-|&AO4-|&AO8-/i/g' 
--regextrans2 's/&AM0-|&AMw-|&AM4-|&AM8-/I/g' 
--regextrans2 's/&APE-/n/g' 
--regextrans2 's/&ANE-/N/g' 
--regextrans2 's/&APM-|&API-|&APQ-|&APY-|&APU-|&APg-|&AVM-/o/g' 
--regextrans2 's/&ANM-|&ANI-|&ANQ-|&ANY-|&ANU-|&ANg-|&AVI-/O/g' 
--regextrans2 's/&AN8-/s/g' 
--regextrans2 's/&APo-|&APk-|&APs-|&APw-/u/g' 
--regextrans2 's/&ANo-|&ANk-|&ANs-|&ANw-/U/g'

Playground to convert other diacritics / special characters https://onlinephp.io/c/e54a3
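The `&XXX-` tokens in the options above can also be generated rather than looked up by hand. Here is a minimal Python sketch, assuming each accented character stands alone between ASCII characters; the helper names `encode_imap_utf7_char` and `regextrans2_option` are hypothetical and not part of imapsync.

```python
import base64

def encode_imap_utf7_char(ch: str) -> str:
    """Return the modified UTF-7 token for one isolated non-ASCII character."""
    raw = ch.encode("utf-16-be")
    # Modified base64: no "=" padding, and "," instead of "/".
    b64 = base64.b64encode(raw).decode("ascii").rstrip("=").replace("/", ",")
    return "&" + b64 + "-"

def regextrans2_option(accented: str, plain: str) -> str:
    """Build one --regextrans2 option mapping every char in `accented` to `plain`."""
    alternation = "|".join(encode_imap_utf7_char(c) for c in accented)
    return f"--regextrans2 's/{alternation}/{plain}/g'"

print(regextrans2_option("éèêë", "e"))
# --regextrans2 's/&AOk-|&AOg-|&AOo-|&AOs-/e/g'
```

One caveat: two or more consecutive non-ASCII characters are encoded together as a single base64 run, so per-character patterns like these will not match them. Folder names containing such sequences would need to be handled separately.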

gilleslamiral commented 1 month ago
rmff commented 1 month ago

Hi Gilles,

After re-syncing the folders without the --regextrans2 option and checking with Thunderbird as the client, I can confirm that the diacritics are correct and that the problem is in Carbonio.

I got in touch with Carbonio experts, and they said that v24.9.0 has bugs in encoding/decoding in the web interface, so it's not related to imapsync.

Best Regards.

gilleslamiral commented 1 month ago

Going further, if I use --regextrans2 's/&APM-/ó/g', it works.

It's very strange that it worked.

It will be a mess to migrate out of this mailbox later, good luck!
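One likely reason for the warning, offered here as an assumption rather than as Gilles's stated reasoning: under RFC 3501, mailbox names in modified UTF-7 consist of printable ASCII only, so a substitution that writes a raw "ó" leaves a name that other servers or clients may not accept. A small Python sketch of such a check follows; the function name `is_ascii_mailbox_name` is hypothetical.

```python
import re

def is_ascii_mailbox_name(name: str) -> bool:
    """True if the name uses only printable ASCII, as modified UTF-7 requires."""
    return all(0x20 <= ord(c) <= 0x7E for c in name)

original = "Projetos/_Arquivo/Cart&APM-rio"
# Same effect as --regextrans2 's/&APM-/ó/g' applied to this one name.
rewritten = re.sub(r"&APM-", "ó", original)

print(is_ascii_mailbox_name(original))   # True
print(is_ascii_mailbox_name(rewritten))  # False
```

Running a check like this over the destination folder list before a later migration would show which names were rewritten this way.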

However, this seems odd because IMAPSync should already be using UTF-7-IMAP for folder names; in my understanding, this type of conversion should be redundant.

In reality, imapsync doesn't know how folder names are encoded. It takes the folder names as they are on the source and creates them on the destination unchanged, except for the prefix and the separator.
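The behaviour described above can be illustrated with a short Python sketch. This is not imapsync's actual code (imapsync is written in Perl); `map_folder_name` and its parameters are hypothetical names, and the prefix and separator values are taken from the log lines at the top of this issue.

```python
def map_folder_name(h1_name: str, h1_prefix: str, h1_sep: str,
                    h2_prefix: str, h2_sep: str) -> str:
    """Translate a source folder name to the destination naming scheme.

    Only the prefix and the hierarchy separator change; each path
    component, including any "&...-" encoded run, is copied byte for byte.
    """
    if h1_prefix and h1_name.startswith(h1_prefix):
        h1_name = h1_name[len(h1_prefix):]
    parts = h1_name.split(h1_sep)
    return h2_prefix + h2_sep.join(parts)

print(map_folder_name("INBOX.Projetos._Arquivo.Cart&APM-rio",
                      h1_prefix="INBOX.", h1_sep=".",
                      h2_prefix="", h2_sep="/"))
# Projetos/_Arquivo/Cart&APM-rio
```

Because the encoded run `&APM-` passes through untouched, the destination server receives a valid modified UTF-7 name, and how it is displayed is up to the server or client.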