thunderbird / import-export-tools-ng

Import Export Tools that supports Thunderbird v68-v128

How to import Apple Mail .mbox format? #521

Open Cubytus opened 7 months ago

Cubytus commented 7 months ago

Hi there,

I re-discovered this plugin while looking for a way to restore lost messages (a mistake I made while writing a message rule within Thunderbird). I have a third-party backup app that just copied the raw Apple Mail hierarchy, so I'm using that as a source.

Now, Apple Mail's .mbox format is only a folder containing many sub-folders (identified by a single digit), the final one always being "Messages", each with a max of 1000 .emlx files. Importing does work fine, but recreates all the intermediate, empty folders. Therefore it is very time-consuming to move all messages from the myriad of "Messages" folders back to the main folder. (See attached)

Captura de Pantalla 2024-01-11 a la(s) 14 45 47

So far, I haven't found a way to import all .emlx from a folder, and remove the intermediate folder hierarchy. Is that even possible? In the screen capture above, I selected "Acq-1" as the target folder for importation, and whatever the original hierarchy was, I'd expect only the content of "Messages" to be listed right under "Acq-1".

cleidigh commented 7 months ago

@Cubytus IETNG won't trim any file structure, it imports as it sees it. Can you just import the messages folder?

A user created a python script for Apple mboxes, however I don't know if it will do what you want. Make sure and backup, read the notes. https://github.com/thunderbird/import-export-tools-ng/wiki

@cleidigh

Cubytus commented 7 months ago

There is no single "Messages" folder, but anywhere from a few to a handful deep down a given hierarchy, depending on how many messages are stored. They are split so that there are never more than 1000 messages in each, which is great for granularity and, presumably, performance.

Importing a single "Messages" folder is of course possible and works as designed, but importing 10s of such folders sure is a pain.

I tried the python script, but it fails to run for some reason:

$ python3.11 apple2mboxsbd.py "INBOX 2.mbox/"
Traceback (most recent call last):
  File "/Users/cubytus/Downloads/AppZ-/ImportTool/apple2mboxsbd.py", line 1, in <module>
(…)
NameError: name 'false' is not defined. Did you mean: 'False'?

Therefore, having the option to trim 2nd level folders and anything in between would be great.

cleidigh commented 7 months ago

@Cubytus Without being sure since I cannot play with it, I think the script appears to do what you want. We could try and debug together? @cleidigh

Cubytus commented 6 months ago

@cleidigh Sure, should I produce a subset of the messages, not touching the original hierarchy? Or, how should we do that?

cleidigh commented 6 months ago

@Cubytus Apologies, I inadvertently dropped you. I was reminded by another user with the same issue.

Cubytus commented 6 months ago

Well my answer still stands :)

cleidigh commented 6 months ago

@Cubytus Ok I am roping in @johnstonesnow, we will proceed here.

First let's try to see if we hardcode the options if we get any further. I will change the script. @cleidigh

johnstonesnow commented 6 months ago

Hi, not sure what "roping in" means, but I fear it may mean you've misplaced confidence in me. :) I can BARELY USE GITHUB!! I don't understand it much at all! I am therefore not sure how I can possibly help clever folks like those mentioned above :D

johnstonesnow commented 6 months ago

PS, a few people have suggested it may just work to do this:

Export my Mac Mail to MBOX files. Move them to Linux file system, install Thunderbird and import those files.

Should I try? Or is it sure not to work in your experience?

(I have 40-50GB of data in hundreds of nested/subnested local folders in Mac Mail)

cleidigh commented 6 months ago

@johnstonesnow By roping I just meant bringing you here so we can do this in one place. Don't worry too much about github, you are here in the discussion, that is mostly what you need. If you need help just ask.

So I have an older mac, but I don't have any apple mail setup. It's my understanding that both the active mail and backups use a structure with subfolders that is convoluted, see above. As far as I understood you can export single mbox files, however you can't just export all your folders to a simple structured output. If we can fix the script, we should have what we need. @cleidigh

johnstonesnow commented 6 months ago

Thank you, i feel less intimidated now :)

In case I can help shed light, I can tell you exactly what happens when I export mail folders in Mac Mail, but one important point first...

I am running OSX 10.14.6 (Mojave). I am told that soon after this version, they changed the export process. So my use case won't match most others these days as I am running an OS version that has been unsupported for years now (I refused to upgrade for reasons I won't waste time going into, suffice to say 'privacy reasons').

That said, in my version of Mac Mail, I learned something by playing today.

I have tons of folders. So I just used ONE folder for my test export. I clicked on the top level folder (which contains 5 others underneath) to highlight it, then used Mailbox>Export Mailbox function.

The resulting output was a folder called "Archive - Local". Inside that folder was a folder named after the one I clicked on. So if I click on a local mail folder called "Tax" and export to a folder called "mail export" on my desktop, I get:

Mail Export folder in which is contained just ONE folder named "Tax". That looked good, but there's no other folders inside, and there should be.

I found the solution (which you probably already know about), which was to EXPAND the "Tax" folder in Mac Mail's sidebar, and expand any other sub nested folders so that all the folders with other folders inside were expanded, then I clicked on the top level "Tax" folder, and Shift-Clicked the bottom one in the tree (the last in the list of folders contained inside "Tax" folder).

Then when I had all those highlighted, and ran the export again, they all came in properly into my export folder, with directory structure looking correct.

No idea if that's of any use, but thought I'd mention it.

cleidigh commented 6 months ago

@johnstonesnow So what does the last export structure look like? This has subfolders to Tax? @cleidigh

johnstonesnow commented 6 months ago

Yes, it has the exact same folder/subfolder structure as shown in Mac Mail sidebar.

Maybe I can do this better....

I created a Test Folder in my sidebar under "On My Mac" (Local folders) and created some sub folders underneath. NOTE: I only created sub folders in 1 and 2, not 3, as you'll see. I also sent myself a load of test emails to drop one in each and every folder, just so there is some data in each folder. Rather than explain in words, here's what the Mac sidebar looks like with this little 'test tree':

Screenshot 2024-02-25 at 12 27 25

I clicked to highlight JUST THE TOP folder "Test Folder", then right clicked and exported mailbox (same command as Mailbox>ExportMailbox), screenshot of doing that:

Screenshot 2024-02-25 at 12 33 07

I opened the target folder for this export in Finder, to see the output. Here's what it shows:

Screenshot 2024-02-25 at 12 32 31

That was method 1, which is no good. (I assumed it would bring the contents of any folder highlighted for export; it doesn't, it only brings emails, not the folders contained inside.)

Method 2:

Clicking all down arrows in the tree of folders I want to export, so every sub folder is visible, then shift clicking to highlight from top to bottom:

(I showed some redacted other folders to show clearly which are highlighted)

Screenshot 2024-02-25 at 12 36 16

Screenshot 2024-02-25 at 12 38 14

Looking at the Finder output this time:

Screenshot 2024-02-25 at 12 43 25

Please excuse my very non-tech explanations, but here is what seems to be happening to me:

In the tree of folders chosen for export:

  1. Every folder which has mail in it (whether it contains folders or not), produces a FILE called "foldername.mbox", which contains two txt files, mbox.txt and table_of_contents.txt (actually I shouldn't add ".txt" as I can't see any extension, I just assumed it's a txt file based on it offering to open with TextEditor)

  2. If that folder has only mail inside, and NO FOLDERS, that's all you get.

  3. But if that folder also has sub folders inside, you also get a folder called "foldername" .

So if you look at the second column from the left, you can see Test Subfolder 1 and Test Subfolder 2 both have a folder and an mbox file. But TestSubfolder3 does not have a folder, just an mbox, because it doesn't contain any sub folders.
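The layout described above can be checked mechanically. The following is only a sketch under the assumptions stated in this comment: each mailbox is a "foldername.mbox" directory holding a file named "mbox", and subfolders live in a sibling directory named "foldername". The helper name describe_export is made up for illustration and is not part of any existing tool.

```python
from pathlib import Path


def describe_export(root):
    """List every .mbox bundle found under an Apple Mail export.

    Returns tuples of (path relative to root, bundle contains an 'mbox' file,
    a sibling folder with the same name exists and so holds subfolders).
    """
    root = Path(root)
    rows = []
    for bundle in sorted(root.rglob("*.mbox")):
        if not bundle.is_dir():
            continue  # only directories count as bundles in this layout
        has_mbox_file = (bundle / "mbox").is_file()
        # "Tax.mbox" -> "Tax": the sibling folder that would hold subfolders
        sibling = bundle.with_suffix("")
        rows.append(
            (bundle.relative_to(root).as_posix(), has_mbox_file, sibling.is_dir())
        )
    return rows
```

Running it on the "Test Folder" export should show a sibling folder for Test Subfolder 1 and 2 and none for Test Subfolder 3, if the description above holds.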

Hope that wasn't too painful and explains as best I can!

Cubytus commented 6 months ago

We get different layouts then. If I create an empty folder layout on the server, I get this on exporting: Captura de Pantalla 2024-02-25 a la(s) 12 57 48

Captura de Pantalla 2024-02-25 a la(s) 12 57 00 (Newly created layout)

Captura de Pantalla 2024-02-25 a la(s) 13 07 41 (Existing layout)

The mbox file at the end of the path appears to be a text file holding all the messages. Both layouts are very similar to what you had @johnstonesnow .

However, going to ~/Library/Mail/V6/[some_UUID_identified_mailbox] reveals a different folder hierarchy: Captura de Pantalla 2024-02-25 a la(s) 13 05 29 (Newly created layout)

Captura de Pantalla 2024-02-25 a la(s) 13 10 57 (Existing layout, first part) Captura de Pantalla 2024-02-25 a la(s) 13 11 35 (Existing layout, second part)

In this case, the messages exist individually

However, the script doesn't work in this case either:

Lancelot:ImportTool cubytus$ ./apple2mboxsbd.py test2/INBOX/Acquaintances/Claudia
./apple2mboxsbd.py: line 1: payload:allShortcutsEnabled:false: command not found
Lancelot:ImportTool cubytus$ ./apple2mboxsbd.py test2/INBOX/Acquaintances/Claudia.mbox/
./apple2mboxsbd.py: line 1: payload:allShortcutsEnabled:false: command not found
Lancelot:ImportTool cubytus$

cleidigh commented 6 months ago

@Cubytus We definitely need the mbox export. Unfortunately I am still not totally clear on the structures, but will get there. The script was made in 2021 so I don't know what osx version it was targeting. @cleidigh

Cubytus commented 6 months ago

@cleidigh That would have been Big Sur. However, it may be irrelevant as many users upgrade from one version to the following.

"We definitely need the mbox export." Not sure what you mean by that. If the script absolutely, positively needs previous export through Apple Mail, that may introduce an additional problem if the original computer goes kaput.

With a proper backup, one can only assume the raw layout will be available.

OTOH if you mean you'll need my file, that can be done but I'll have to check if the messages contain sensitive personal data or not.

cleidigh commented 6 months ago

@Cubytus What I meant was we need to import the structures with the mbox files NOT individual emlx files.

I think we have to work on the exports not the native raw structures. I'm pretty sure the script is for the former. Tomorrow I will spend some apple time with the script. @cleidigh

cleidigh commented 6 months ago

@Cubytus @johnstonesnow So I just got the script to run unchanged on Windows. It doesn't do the transformation, but I don't get any weird errors. I think your script must be damaged?

So looking at the script it expects folders with .mbox foldername suffix. This doesn't match anything you guys export so I am baffled. Googling so far has not given me anything. It seems that the script was for a long ago time?? @cleidigh

cleidigh commented 6 months ago

@obar I hope all is well. I was wondering if you could give us some help with your script. It expects folders with the mbox extension, but this doesn't match what export structure people get. Did something change or is there any options we need? https://github.com/thunderbird/import-export-tools-ng/pull/114

Thanks @cleidigh

cleidigh commented 6 months ago

@Cubytus You have Ecto1a.mbox highlighted in one of your examples above. Is that a folder that contains two files? The mbox with no extension and the toc. The reason I ask is because the icon for Ecto1a.mbox is not a folder icon. If so I have misread the structure. @cleidigh

obar commented 6 months ago

It expects folders with the mbox extension, but this doesn't match what export structure people get. Did something change or is there any options we need?

I haven't dug into this in detail, but from a glance at a screenshot above I see folders with the .mbox suffix. It's possible that the export format has changed (I don't have a Mac to test with, but I do think that years back, my friend ran this script on his Mac). That said, I don't think the script is even running. See:

./apple2mboxsbd.py: line 1: payload:allShortcutsEnabled:false: command not found

An error on line 1 is never a good sign ;) @Cubytus do you have Python 3 installed? Maybe send us the output of this command in your terminal:

python3 --version

johnstonesnow commented 6 months ago

We get different layouts then. If I create an empty folder layout on the server,

I didn't do anything "on the server". I am ONLY working with local folders (i.e. "On My Mac" folders)

cleidigh commented 6 months ago

@obar @Cubytus I think the script is messed up on download. Perhaps copying the raw link? @cleidigh

obar commented 6 months ago

One can save just the script using the raw feature in github, here is that link: https://github.com/thunderbird/import-export-tools-ng/raw/master/utility-scripts/apple2mboxsbd.py
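A quick way to tell whether a saved copy is the real script or a web page payload is sketched below. Both error messages earlier in the thread mention "payload", "allShortcutsEnabled" and "false" on line 1, which suggests (this is an inference, not something confirmed here) that the saved file was JSON page data. JSON such as {"payload": {"allShortcutsEnabled": false}} happens to be syntactically valid Python, which would explain why python3.11 reported a NameError on 'false' rather than a syntax error. The function name diagnose_download is hypothetical.

```python
import json


def diagnose_download(text):
    """Guess what a saved 'script' file really contains.

    Returns "html", "json", or "probably-source". This is a heuristic only.
    """
    stripped = text.lstrip()
    if stripped.startswith("<"):
        return "html"  # looks like a saved web page
    try:
        json.loads(stripped)
    except ValueError:
        # Not parseable as JSON, so it is at least not a JSON page payload
        return "probably-source"
    return "json"
```

Usage would be along the lines of diagnose_download(open("apple2mboxsbd.py", encoding="utf-8").read()); anything other than "probably-source" means the file should be downloaded again from the raw link.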

cleidigh commented 6 months ago

@Cubytus @johnstonesnow @obar Ok, after understanding that the x.mbox are folders despite finder using a different icon, I converted a setup structure on Windows just fine.

So we have to get @Cubytus and @johnstonesnow to get the script working and we should be good. @cleidigh

Cubytus commented 6 months ago

@cleidigh

You have Ecto1a.mbox highlighted in one of your examples above. Is that a folder that contains two files? The mbox with no extension and the toc.

Depends how I access the folder. I'll take the Claudia.mbox folder as an example since, being empty, Ecto1a.mbox doesn't represent what will be found. The native, server-side structure: Captura de Pantalla 2024-02-26 a la(s) 11 18 16

If I first do an export through Mail, then yes, I get both mbox and table_of_contents files, the former to store all messages, the latter to tell the email client the messages "coordinates". Captura de Pantalla 2024-02-26 a la(s) 11 27 01

If I access the raw structure as can reasonably be expected in a worst-case scenario (non-working Mac OS X), then no, there's only Info.plist + [some_UUID] folder below the .mbox folder. Then there's [some_UUID] > Data > numbered folder 1 > numbered folder 2 > Numbered folder 3 > (Numbered folder 4, etc.) > Messages.

There are at least 3 numbered folders before getting to the "Messages" folders. The only constant is that the "Messages" folder never contains more than 999 messages. Captura de Pantalla 2024-02-26 a la(s) 11 30 24

In other words, for the sake of this script, all the folders between [some_UUID] and the various "Messages" folders aren't necessary, and the "Messages" folders themselves should be merged before further steps.
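The merging step described here could look roughly like the sketch below. It assumes only what this comment states about the raw layout: .emlx files sit in folders named "Messages" at some depth under the .mbox folder. The helper collect_emlx is hypothetical; it copies rather than moves, so the backup is left untouched.

```python
import shutil
from pathlib import Path


def collect_emlx(raw_mbox, dest):
    """Copy every .emlx found in any 'Messages' folder into one flat folder.

    Returns the number of files copied. dest should live outside raw_mbox.
    """
    raw_mbox, dest = Path(raw_mbox), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for src in sorted(raw_mbox.rglob("*.emlx")):
        if src.parent.name != "Messages":
            continue  # ignore anything outside the Messages folders
        target = dest / src.name
        n = 1
        while target.exists():
            # Two Messages folders may reuse a file name; keep both copies
            target = dest / f"{src.stem}-{n}{src.suffix}"
            n += 1
        shutil.copy2(src, target)
        copied += 1
    return copied
```

The flat folder produced this way can then be selected as a single import source instead of importing tens of "Messages" folders one by one.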

@obar I have two versions of python installed, 2.7.16 and 3.11.7. When python --version is called, I get Python 2.7.16. However, I can explicitly call python3.11 --version to get the exact sub-3.11 version. TBH I don't recall why I installed both, they probably came as dependencies for some piece of software, though unsure which ones. Given the 2.7 branch isn't updated anymore (please correct me!), I'd rather have the python command linked to the 3.11 version, and find which port is still using 2.7.16 although it's a minor issue.

@cleidigh

I think the script is messed up on download. Perhaps copying the raw link?

It is! For some reason the downloaded script doesn't match the source code at all! (See there if interested)

Going back to first step with the now-correct script… When ran against an Apple Mail export, I get:

python3.11 apple2mboxsbd.py ../testClo/INBOX/Acquaintances/Claudia.mbox/
usage: apple2mboxsbd.py [-h] [--verbose] [--dry-run] IN
apple2mboxsbd.py: error: argument IN: Top level does not contain any .mbox folders. Are you sure this is an Apple Mail export? If you have run the script on this directory already, it has been converted.

Second try:

python3.11 apple2mboxsbd.py ../testClo/INBOX/Acquaintances/
Finished

So the first step is that, contrary to what I thought, the script doesn't expect a .mbox folder as input, but rather the folder that contains the .mbox folder. The error message is unclear.

As for the result, there's no difference between the extension-less mbox file and the Claudia file generated by the script:

diff ../testClo/INBOX/Acquaintances/Claudia ../testClo/INBOX/Acquaintances/Claudia.mbox/mbox

Is that the expected result?

The script doesn't work against the raw folder structure, though.

cleidigh commented 6 months ago

@Cubytus First back up the Acquaintances structure, as the script modifies it in place. Then run the script against the Acquaintances folder. You should end up with the structure transformed into sbd subfolders. @cleidigh
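To make the "transform to sbd subfolders" concrete, here is a minimal sketch of the idea. It is not the code of apple2mboxsbd.py. The hypothetical helper export_to_sbd assumes the export layout described earlier in this thread (X.mbox/mbox plus an optional sibling folder X) and the Thunderbird convention of an mbox file X next to an X.sbd folder. Unlike the real script, it writes a separate output tree instead of modifying the export in place.

```python
import shutil
from pathlib import Path


def export_to_sbd(src_dir, out_dir):
    """Mirror an Apple Mail export as mbox files plus .sbd subfolder directories.

    out_dir must be outside src_dir, otherwise the walk would revisit its output.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for entry in sorted(src_dir.iterdir()):
        if not entry.is_dir():
            continue
        if entry.suffix == ".mbox":
            # "Tax.mbox/mbox" becomes the plain mbox file "Tax"
            name = entry.name[: -len(".mbox")]
            mbox_file = entry / "mbox"
            if mbox_file.is_file():
                shutil.copy2(mbox_file, out_dir / name)
        else:
            # Plain folder "Tax" holds the subfolders and becomes "Tax.sbd"
            placeholder = out_dir / entry.name
            if not placeholder.exists():
                # Assumption: an empty mbox file should sit next to X.sbd
                placeholder.touch()
            export_to_sbd(entry, out_dir / f"{entry.name}.sbd")
```

Because the source tree is only read, a separate backup is less critical with this variant, at the cost of needing twice the disk space.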

Cubytus commented 6 months ago

@cleidigh That's indeed what I got. Captura de Pantalla 2024-02-26 a la(s) 12 48 42

Now the concern is that the script can't be used on the raw structure, as will be found when trying to transfer existing messages from, say, a backup made by Mac OS X to platform-independent Thunderbird.
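One possible starting point for handling the raw structure without a Mac is sketched below, using Python's standard mailbox module. It rests on an assumption not verified in this thread: that an .emlx file begins with a line giving the message size in bytes, followed by the message itself and then trailing Apple metadata. The helper names read_emlx and emlx_folder_to_mbox are hypothetical, and this is not part of apple2mboxsbd.py.

```python
import mailbox
from pathlib import Path


def read_emlx(path):
    """Return the raw message bytes stored in one .emlx file."""
    data = Path(path).read_bytes()
    first_line, _, rest = data.partition(b"\n")
    # Assumption: line 1 holds the message length in bytes
    length = int(first_line.strip())
    return rest[:length]


def emlx_folder_to_mbox(emlx_dir, mbox_path):
    """Append every .emlx in emlx_dir to an mbox file; return how many were added."""
    box = mailbox.mbox(str(mbox_path))
    count = 0
    try:
        for emlx in sorted(Path(emlx_dir).glob("*.emlx")):
            box.add(read_emlx(emlx))
            count += 1
    finally:
        box.close()  # flushes pending messages to disk
    return count
```

The resulting mbox file could then be imported the same way as the ones produced from an Apple Mail export, though that has not been tried here.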

cleidigh commented 6 months ago

@Cubytus I think we have to deal with just the script working on the apple export. The raw structures can be loaded into a new mac and exported from there. @cleidigh

Cubytus commented 6 months ago

The raw structures can be loaded into a new mac

That's precisely what I expected would be an unnecessary step. If a user's Mac computer breaks or is stolen/lost, requiring an additional, considerable expense to export one's emails just doesn't make sense to me.

cleidigh commented 6 months ago

@Cubytus I totally agree on the implications. I just can't go there myself as I am overloaded. It would be a good contribution if you were to modify the script. @cleidigh

cleidigh commented 5 months ago

@Cubytus Were you able to import after the script worked? @cleidigh