Closed jsfan3 closed 3 months ago
Interesting. The string &DiAOMg4pDjU-66_xF8FF_&DhsOAQ4qDgQ- is decomposed into:
&DiAOMg4pDjU- (utf-7)
66_xF8FF_ (regular text)
&DhsOAQ4qDgQ- (utf-7)
I don't think _xF8FF_ has any special meaning in email protocols. It looks like an attempt at using a unicode character. 0xf8ff is in the "private use" unicode space, see https://en.wikipedia.org/wiki/Private_Use_Areas. If that is indeed intended, I would expect it to be UTF-7-encoded.
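To make the decomposition above concrete, here is a minimal Python sketch of decoding IMAP "modified UTF-7" (the mailbox-name encoding from RFC 3501 section 5.1.3). The function name decode_modified_utf7 is made up for illustration, and the sketch does no validation beyond what the standard library does. Note that _xF8FF_ passes through untouched, because it is plain ASCII text rather than an encoded character.

```python
import base64


def decode_modified_utf7(name: str) -> str:
    """Decode an IMAP modified UTF-7 mailbox name (illustrative sketch)."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            # Plain printable ASCII is copied through unchanged.
            out.append(ch)
            i += 1
            continue
        end = name.index("-", i)  # raises ValueError if the shift is unterminated
        chunk = name[i + 1:end]
        if chunk == "":
            out.append("&")  # "&-" encodes a literal ampersand
        else:
            # Modified base64 uses "," instead of "/" and omits padding.
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


decoded = decode_modified_utf7("&DiAOMg4pDjU-66_xF8FF_&DhsOAQ4qDgQ-")
print(decoded)
```

The two encoded runs decode to four Thai characters each, with the literal text 66_xF8FF_ between them.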
I found https://stackoverflow.com/questions/10423116/msexchange-url-encoding, which mentions that Exchange intends it as an escape for a slash (no idea why you would want to have/allow a slash in a directory name...). Since I don't think this is standard behaviour, I don't think we should interpret these characters specially. Perhaps Exchange can be changed to just treat slash as a path separator, but I doubt it. It's probably best to rename the folder manually after syncing.
I am synchronizing a large email archive from Exchange to mox, where the folders to be renamed manually would be in an archive containing hundreds of thousands of folders. An impossible job to do manually. I was not the one who added a slash to the names of some folders.
Also, I use mox to create an archive that is synchronized with Outlook from time to time, so I cannot edit anything manually, otherwise the same problem would occur the next time I synchronize.
I also did a search and did not find any documentation referring to a standard. I think it is a specific and undocumented behavior of Outlook.
Alternatively, it could be a non-standard and undocumented behavior of Davmail, which imapsync connects to in order to synchronize Outlook with mox.
rfc 9051 declares what the server should be doing
https://datatracker.ietf.org/doc/html/rfc9051#name-mailbox-naming
I am digging through those data types. utf7 is deprecated, since IMAP4rev2 uses utf-8 quoted strings, and rfc 9051 enumerates what's wrong with this utf7 variant, which isn't a Unicode standard. The f8ff suggests the poster is in a utf-16 environment like Java or ECMAScript, which is also kind of legacy since the world moved on to utf-8 and utf-32, like Go. utf-16 was initially intended to hold all characters but proved to have an insufficient number of code points, and utf-8 proved to be more efficient since the most common characters are single-byte and it's a superset of US ASCII.
I think this problem here comes down to server configuration:
— a. the server may be able to disable mailbox hierarchy. It is not a required feature. This would probably fail at imap SELECT
— b. the hierarchy character, typically a single ascii character /, may be made configurable to one or more unicode characters. This would probably fail at imap SELECT
— c. do nothing
It seems here the root cause is that Microsoft allowed slashes, then had to convert those slashes into something that is not a slash and picked an obscure unicode character. In 2024, that is not net-unicode per rfc 9051
rfc 9051 gets around that by declaring the server can do whatever, as long as the server does the same whatever every time and that whatever works recursively
this will work for as long as there is no attempt to map a mox mailbox name back to exchange
The poster should probably process the mailbox name to ensure that (a) it is valid unicode and valid net-unicode (rfc 5198), and (b) it gets around the hierarchy character.
In Go, strings and byte slices may contain invalid UTF-8; decoding them yields the replacement character U+FFFD where the bad bytes were.
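A rough Python sketch of such a check follows. It is not a complete rfc 5198 validator. The function name component_problems and the exact set of rules (NFC normalization, no control, private-use or surrogate characters, no hierarchy delimiter inside one path component) are assumptions chosen for illustration.

```python
import unicodedata


def component_problems(component: str, delimiter: str = "/") -> list:
    """Return a list of human-readable problems found in one path component."""
    problems = []
    if not unicodedata.is_normalized("NFC", component):
        problems.append("not NFC-normalized")
    for ch in component:
        category = unicodedata.category(ch)
        if category == "Cc":
            problems.append(f"control character U+{ord(ch):04X}")
        elif category == "Co":
            problems.append(f"private-use character U+{ord(ch):04X}")
        elif category == "Cs":
            problems.append(f"surrogate U+{ord(ch):04X}")
    if delimiter in component:
        problems.append(f"contains hierarchy delimiter {delimiter!r}")
    return problems


print(component_problems("this\uf8ffthat"))
print(component_problems("this_xF8FF_that"))
```

Note that the literal text _xF8FF_ from this thread passes such a check, since it is ordinary ASCII; only a real U+F8FF code point would be flagged.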
I can imagine the F8FF is meant as a literal slash to indicate "this/or/that", and not meant as a hierarchy separator (directories).
Perhaps the mailboxes can be renamed both at Exchange and mox to e.g. "this or that", or perhaps use some other character to indicate the "or".
If the slash was meant as separator, perhaps the mailboxes can be changed by first creating a new "this" and then renaming the "this_xF8FF_that" to "this/that"? Also both in Exchange and mox. By doing it on both sides, they both get "fixed" and future synchronization shouldn't cause trouble (though may trigger a full resync of those directories).
In any case, this should probably be done with a little script/program that talks IMAP: list mailbox folders/directories with the xF8FF, and apply the changes.
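A minimal sketch of what such a script could look like in Python, using the standard imaplib module. The helper names (plan_renames, quote, apply_renames), the "-" replacement and the "/" delimiter are assumptions for illustration. Listing the mailboxes and logging in are left out, and nothing here has been tested against Exchange, Davmail or mox.

```python
import imaplib

MARKER = "_xF8FF_"  # literal ASCII text seen in the folder names in this thread


def plan_renames(names, replacement="-", delimiter="/"):
    """Return (old, new) pairs, deepest mailboxes first.

    Only the last path component is changed in each pair, so renaming a
    child never depends on its parent having been renamed already.
    """
    if delimiter in replacement:
        raise ValueError("replacement must not contain the hierarchy delimiter")
    plan = []
    for name in sorted(names, key=lambda n: n.count(delimiter), reverse=True):
        parent, sep, leaf = name.rpartition(delimiter)
        if MARKER in leaf:
            plan.append((name, parent + sep + leaf.replace(MARKER, replacement)))
    return plan


def quote(name):
    """Wrap a mailbox name as an IMAP quoted string (imaplib does not do this)."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def apply_renames(conn: imaplib.IMAP4, plan):
    """Issue one RENAME per planned pair and stop at the first failure."""
    for old, new in plan:
        typ, data = conn.rename(quote(old), quote(new))
        if typ != "OK":
            raise RuntimeError(f"RENAME {old!r} failed: {data!r}")
```

Because the marker is plain ASCII, the replacement can be applied to the names as they appear on the wire (modified UTF-7), provided the replacement itself is plain ASCII without "&". The same plan would have to be applied on both servers for later synchronization runs to line up.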
If you "fix" this, this could still be an issue for other scripts (like imapsync).
Especially when the other server disallows RENAME (rfc4314 ACL not kx): https://www.rfc-editor.org/rfc/rfc4314.html#section-4
Say: I run imapsync to synchronize my mail between 2 servers.
Mox has this-that, the other has this_xF8FF_that.
imapsync will create this_xF8FF_that and put all messages there.
Also / doesn't have to be the separator, it could be anything (mostly a dot .)
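Since the separator varies per server, a script should read it from the LIST response rather than hard-code it. Below is a small Python sketch that parses one LIST response line of the kind imaplib's list() returns as bytes. The name parse_list_line is made up for illustration, and the sketch does not handle mailbox names sent as IMAP literals.

```python
import re

# flags in parentheses, then the delimiter (quoted char or NIL), then the name
LIST_RE = re.compile(
    rb'^\((?P<flags>[^)]*)\) (?P<delim>NIL|"(?:\\.|[^"\\])") (?P<name>.+)$'
)


def _unquote(raw: bytes) -> str:
    """Strip IMAP quoting from a mailbox name, if present."""
    if len(raw) >= 2 and raw[:1] == b'"' and raw[-1:] == b'"':
        raw = re.sub(rb"\\(.)", rb"\1", raw[1:-1])
    return raw.decode("utf-8")


def parse_list_line(line: bytes):
    """Return (delimiter or None, mailbox name) for one LIST response line."""
    m = LIST_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized LIST line: {line!r}")
    delim = m.group("delim")
    # The last byte inside the quotes is the delimiter, escaped or not.
    delimiter = None if delim == b"NIL" else delim[1:-1][-1:].decode("ascii")
    return delimiter, _unquote(m.group("name"))


print(parse_list_line(b'(\\HasNoChildren) "." "INBOX.this_xF8FF_that"'))
```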
Closing, I don't think we can do anything in mox about this. We don't want to recognize special private uses like xF8FF, and it doesn't seem wise to interpret it as a slash anyway (affecting the hierarchy). Best solution is to change the folder names on the source machines so they no longer include a literal slash. If there are many folders, some scripting with an IMAP library is probably the best way forward.
Please take a look at how the following two IMAP folders appear in Outlook and mox webmail:
[Inbox/&DiAOMg4pDjU-66_xF8FF_&DhsOAQ4qDgQ-] [Inbox/&DiAOMg4pDjU-66_xF8FF_&DhsOAQ4qDgQ-/&DhsOAQ4qDgQ-]
Outlook:
mox:
Basically, "_xF8FF_" stands for the "/" character when it is used within a folder name rather than as the separator.