Open lizfischer opened 11 months ago
I see that this is not an issue in the new version! apologies
@lizfischer Actually that was going to be my first question: what version? Thunderbird has had, and still has, some totally inconsistent From behavior. Up to 102, IETNG just copied out mbox files as is. v14.0.0 builds an mbox within IETNG. I now create the generally accepted format UNLESS a From hdr exists; I may try to change that. Right now I have bigger problems trying to handle mbox files over 4GB with an API that cannot work over 4GB! @cleidigh
I was working with the older version for 102. Just installed the new version and the dates align with my expectation now. I am having issues with this account-level export not capturing all folders on the new version, unfortunately:
Edit: Also, I can't tell if there's an issue for this already & maybe I should open a new one, but it would be awesome if account-level export could add the .mbox extension to the files it creates. I've been doing it manually to make it interpretable to my mbox viewer
Edit 2: I see the structure issue flagged in https://github.com/thundernest/import-export-tools-ng/issues/432#issuecomment-1747365942 !
@lizfischer The account export is partially messed up, I listed that in the release post. It doesn't have the container, but the folders in the sbd should export correctly. What are you seeing? I have to do work to deal with the mbox extension for structured exports and imports. Exports are easy, imports not as easy. This will come later. @cleidigh
In the screenshot I posted above, I expected that all the folders shown in Thunderbird (Inbox, Drafts, Sent, Nesting folder child 1, etc.) would have corresponding files in the sbd folder. I made a new TB profile & re-synced and that fixed the issue, though. Must have been some quirk with upgrading from 102 to 115. I'm seeing all of them now (compare below with previous screenshot)
Adding the .mbox extension manually used to work to make the files viewable in e.g. PST Viewer Pro, but that doesn't seem to be the case anymore
@lizfischer I don't know why the viewer won't work; the files are the same with the exception of the more standard From. @cleidigh
Yeah, seems like a parsing issue on their end. Weird that it likes the worse version of the From line. Anyway, thanks for your help!! And thanks for your work on this plugin, it's great. I'm testing it for use in the acquisition of donors' email by the manuscripts division of a library
Ah, I figured it out--the viewer isn't expecting it to be "From - foo@bar.com Date Time Timezone". It expects something adhering more closely to the RFC spec for MBOX:
Each message in the mbox database MUST be immediately preceded by a single separator line, which MUST conform to the following syntax:
- The exact character sequence of "From";
- a single Space character (0x20);
- the email address of the message sender (as obtained from the message envelope or other authoritative source), conformant with the "addr-spec" syntax from RFC 2822;
- a single Space character;
- a timestamp indicating the UTC date and time when the message was originally received, conformant with the syntax of the traditional UNIX "ctime" output sans timezone (note that the use of UTC precludes the need for a timezone indicator);
- an end-of-line marker.
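For anyone following along, here is a minimal Python sketch of a separator line built to the syntax quoted above. The function name `mbox_separator` and its parameters are hypothetical, chosen only for illustration; this is not how IETNG or Thunderbird builds the line. It relies on `time.asctime`, which produces the traditional ctime layout with the year after the time and no timezone.

```python
import time


def mbox_separator(sender: str, received_epoch: float) -> str:
    """Build a "From " separator line in the layout quoted above.

    sender: envelope sender address (assumed to already be a valid addr-spec).
    received_epoch: receive time as seconds since the Unix epoch.
    """
    # time.gmtime converts to UTC, so no timezone indicator is needed.
    # time.asctime formats as e.g. "Tue Sep 26 14:33:16 2023" (ctime layout).
    timestamp = time.asctime(time.gmtime(received_epoch))
    return f"From {sender} {timestamp}\n"
```

For example, `mbox_separator("foo@bar.com", 1695738796)` returns `"From foo@bar.com Tue Sep 26 14:33:16 2023\n"`.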
Removing the dash, extra spaces, and timezone from an IETNG-exported file fixes the issue. For example, changing From - foo@bar.com Tue Sep 26 2023 07:33:16 GMT-0700
to From foo@bar.com Tue Sep 26 2023 14:33:16
Not sure who is more correct here, IETNG in its output or the viewer in its insistence on format.
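The manual fix described above can be sketched as a small Python helper. This is only an illustration under assumptions: the regular expression matches the one Thunderbird-style layout shown in this comment, the name `normalize_separator` is hypothetical, and lines in any other layout are returned unchanged.

```python
import re
from datetime import datetime, timezone

# Matches lines like:
#   From - foo@bar.com Tue Sep 26 2023 07:33:16 GMT-0700
_TB_SEPARATOR = re.compile(
    r"^From\s+-\s+(?P<addr>\S+)\s+"
    r"(?P<stamp>\w{3} \w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}) "
    r"GMT(?P<offset>[+-]\d{4})\s*$"
)


def normalize_separator(line: str) -> str:
    """Drop the dash and extra spaces, and convert the local time to UTC."""
    match = _TB_SEPARATOR.match(line)
    if match is None:
        return line  # not the layout handled here, leave it alone
    local = datetime.strptime(
        f"{match['stamp']} {match['offset']}", "%a %b %d %Y %H:%M:%S %z"
    )
    utc = local.astimezone(timezone.utc)
    # Keeps the year-before-time order used in the example above.
    return f"From {match['addr']} {utc.strftime('%a %b %d %Y %H:%M:%S')}"
```

Applied to the first example line, this yields the second one: the `-0700` offset is folded into the time, so 07:33:16 becomes 14:33:16.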
Edit: fwiw, .mboxes generated by Google Takeout use this: From 1772327484425937763@xxx Mon Jul 24 18:26:42 +0000 2023
and those generated by Emailchemy use: From - Thu Oct 05 08:52:05 2023
, where (I'd guess) the dash is standing in for a missing email address
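Since the separator layouts differ this much between tools, a reader might want to be tolerant rather than strict. Below is a rough Python sketch that tries the three date layouts mentioned in this thread. The function name and the format list are assumptions for illustration, and timestamps without an offset are treated as UTC, which is also an assumption.

```python
from datetime import datetime, timezone
from typing import Optional

# Date layouts seen in this thread (not an exhaustive list).
_FORMATS = (
    "%a %b %d %H:%M:%S %z %Y",  # Google Takeout: Mon Jul 24 18:26:42 +0000 2023
    "%a %b %d %H:%M:%S %Y",     # ctime style:    Thu Oct 05 08:52:05 2023
    "%a %b %d %Y %H:%M:%S",     # year first:     Tue Sep 26 2023 14:33:16
)


def parse_separator_date(line: str) -> Optional[datetime]:
    """Return the separator timestamp as an aware UTC datetime, or None."""
    if not line.startswith("From "):
        return None
    tokens = line[5:].split()
    # Try first with the leading token (address or dash) skipped,
    # then with all tokens in case there is no address at all.
    for start in (1, 0):
        text = " ".join(tokens[start:])
        for fmt in _FORMATS:
            try:
                parsed = datetime.strptime(text, fmt)
            except ValueError:
                continue
            if parsed.tzinfo is None:
                # Assumption: no offset means the time is already UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed.astimezone(timezone.utc)
    return None
```

Lines that match none of the layouts return `None`, so a caller can fall back to the message's own `Date:` header.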
@lizfischer I think that the converter is ridiculous in its requirement for a non functional separator. I have seen many more formats than you mentioned. I don't have an issue with dropping the dash as this is an artifact from Thunderbird and it's not required on the import side. I don't like dropping the timezone though. @cleidigh
From foo@bar.com Tue Sep 26 2023 07:33:16-0700
would be closer to the spec & works in the viewers I've been testing. I don't think you have to drop the timezone, but I do think reformatting to be closer in line with the MBOX file format specifications is important. So that means dropping the dash and duplicate spaces, in addition to a slight change to timezone presentation.
@lizfischer I agree this is the "smart" thing to do especially since I am abandoning the non-conformant TB artifacts. I am coordinating with the Thunderbird mbox developer so hopefully we can get on the same page. So long as TB is tolerant, which it should be, I can do the ctime format. @cleidigh
@cleidigh Thanks, I'll check it out
Hi @cleidigh, I installed the preview you sent, v14.0.1-b1-fdt1, & am still getting the From lines like From - no-reply@cc.yahoo-inc.com Mon Oct 16 2023 16:18:56 GMT-0700
rather than From no-reply@cc.yahoo-inc.com Mon Oct 16 2023 23:18:56-0700
@lizfischer First you should update to my first beta v14.0.1-b1 not that anything has changed. Look at the message source of one of the messages. I suspect this is Thunderbird incorrectly returning a From separator for an individual message. This is one of several Thunderbird issues I am discussing with the developers. I will somehow address in v14.0.1, the upcoming maintenance release. Can you id the source, imap, archive or Local folder? @cleidigh
Re-installed from the link in #466 and it's looking good. Not sure why it's different from the version the other day, but glad it's working! Thanks
@lizfischer Depends on msg, source and Thunderbird's mood. I am assuming TB bad and will deal with it in b2 which should push for tomorrow. All betas here:
@cleidigh
@lizfischer Try b2. Usually it's local folders that had issues. All replaced now. @cleidigh
Working well! My team sends their thanks, too
@lizfischer Great, one for the team... Are you doing mbox imports? I ask because I am close to releasing b2 with an important mbox import patch for an edge condition and want testers. @cleidigh
We're not doing any imports, just using the exports for archiving purposes.
Thx @cleidigh
@lizfischer Hope all is well... Wanted to reopen this discussion as I think we ended up with a less than standard date format. I have another user that is having trouble with the current format on Linux. I think with all the back and forth I missed the reference to Unix ctime format. This puts the year at the end and appears to be more common. I would assume this change would not affect you since I believe Google format has worked for you.
Any thoughts if I change? @cleidigh
This is a feature request. Currently, Thunderbird puts the date it synced the email in the "From - " line at the start of a message. This gets interpreted by many MBOX viewers as the date received, which technically in the MBOX spec it should be, but it's a bit misleading. Other Thunderbird export tools, like Emailchemy, fix this problem by changing the "From - " line during export to include the date listed in the "Date:" field of the message header. It would be a huge help if ImportExportToolsNG had an option for this when exporting folders.
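To make the request concrete, here is a minimal Python sketch of deriving the separator from the message's own headers instead of the sync time. It is only an illustration, not IETNG's or Emailchemy's implementation: the function name is hypothetical, the `From:` header stands in for the envelope sender, a dash is used when no address is found, and a `Date:` header is assumed to be present.

```python
import time
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def separator_from_headers(raw_message: str) -> str:
    """Build a "From " separator using the Date: header of the message."""
    msg = message_from_string(raw_message)
    # Assumption: the From: header is an acceptable stand-in for the
    # envelope sender; fall back to "-" when no address can be parsed.
    sender = parseaddr(msg.get("From", ""))[1] or "-"
    # Assumption: a Date: header exists; a missing one raises an error here.
    sent = parsedate_to_datetime(msg["Date"])
    # utctimetuple() folds the header's offset into UTC; asctime gives ctime layout.
    return f"From {sender} {time.asctime(sent.utctimetuple())}"
```

For a message with `Date: Tue, 26 Sep 2023 07:33:16 -0700` and `From: Foo <foo@bar.com>`, this produces `From foo@bar.com Tue Sep 26 14:33:16 2023`.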