Betterbird / thunderbird-patches

Betterbird is a fork of Mozilla Thunderbird. Here are the patches that provide all the goodness.
Other
455 stars 20 forks source link

Avoid duplicates in Gloda search involving Gmail's "All Mail" folder #224

Closed Betterbird closed 7 months ago

Betterbird commented 9 months ago

OK so I ended up writing an add-on for this (my first one! Took me a few hours to figure everything out): https://github.com/mayerwin/thunderbird-hide-all-mail-duplicates

Would probably deserve to be merged to the core, but only if improved to only apply to Gmail and Google Workspace, for all languages as 'All Mail' might change based on the display language configured in Gmail), and maybe optimize the performance a bit when checking for duplicates (the aItems email collection could be projected once into a dictionary of emails indexed by messageId).

The code to patch is in https://github.com/mozilla/releases-comm-central/blob/master/mailnews/db/gloda/modules/GlodaSyntheticView.jsm and should be updated using the following logic (instead of monkey patching the prototype):

GlodaSyntheticView.prototype.reportResults = function(aItems) {
    try {
        // Filter out duplicate messages from aItems before calling the original function
        const filteredItems = aItems.filter(item => {
            const hdr = item.folderMessage;
            if (hdr) {
                // Determine if an item is a duplicate
                //console.log("Email13: " + hdr.messageId + " in " + hdr.folder.name);
                if (hdr.folder.name === "All Mail") {
                    // Check the rest of aItems for a message with the same messageId but a different folder
                    const duplicateItem  = aItems.find(otherItem => {
                        const otherHdr = otherItem.folderMessage;
                        return otherHdr &&
                               otherHdr.messageId === hdr.messageId &&
                               otherHdr.folder.name !== "All Mail";
                    });
                    // If it's a duplicate, filter it out by returning false, but make sure to update selectedMessage if necessary.
                    if (duplicateItem && hdr.messageId === this.selectedMessage.messageId) {
                        this.selectedMessage = duplicateItem.folderMessage;
                    }
                    return !duplicateItem ;
                }
            }
            return true;
        });

        // Call the original reportResults function with the filtered list of items
        GlodaSyntheticView.prototype.originalReportResultsGlobal88319.call(this, filteredItems);
    } catch (e) {
        console.error('An error occurred while filtering items:', e);
        return true;
    }
    //console.log(name);
    //Services.wm.getMostRecentWindow("mail:3pane").alert(name);
}

If you think it'd be a nice feature in BetterBird, feel free to add it once it matches your quality bar. Definitely must be a pain point for all Gmail/Google Workspace users.

However, a better fix would be to prevent downloading emails in All Mail in the first place if they are already present in another folder. But this sounds a bit more complex to achieve, as we can't know if the emails are present in another folder until we've checked all folders. The email headers should suffice as the message ID is identical so it should help with the performance, but still it isn't trivial to implement it optimally to not slow things down, and Thunderbird may not have ready logic to use to avoid loading some emails (and hiding them without deleting them) within a folder.

Originally posted by @mayerwin in https://github.com/Betterbird/thunderbird-patches/issues/215#issuecomment-1793544100

Betterbird commented 9 months ago

@mayerwin, please file an enhancement request at https://bugzilla.mozilla.org/.

As far as we've seen, the TB codebase only offers a check for a Gmail server, https://searchfox.org/comm-central/rev/a38a38a923c6b3ddef0187ec44d25de8033e8824/mailnews/imap/public/nsIImapIncomingServer.idl#51, not whether a folder is "All Mail". Someone needs to experiment how this behaves in a localised version of Gmail.

Betterbird commented 9 months ago

Not a whole lot of action here lately. I've change one Gmail account to a Spanish interface and "All Mail" is now called "Todos". So this will be a bit of a nightmare to do. I'll check how TB/BB detect "All Mail" and make it an archive.

Betterbird commented 9 months ago

You can probably recognise the "All Mail" folder with this code:

if (hdr.folder.server instanceof Ci.nsIImapIncomingServer &&
    hdr.folder.server.isGMailServer &&
    hdr.folder.flags & Ci.nsMsgFolderFlags.Archive) { ...

This will of course also match archive folders on a Gmail server which are not "All Mail".

mayerwin commented 8 months ago

Thanks for looking into this. I just asked ChatGPT and here's what it came up with: https://chat.openai.com/share/b3cc31de-47ab-4ea6-87bc-6ef95b813196

To detect the "All Mail" folder in Gmail using IMAP, especially in a way that works regardless of language since the folder name is translated, you can rely on the unique attributes or flags that are assigned to system folders in Gmail. Gmail uses a set of standard system flags for its special folders, which are consistent across different languages. These flags can be used to identify the "All Mail" folder.

Here's a general approach to identify the "All Mail" folder:

  1. Connect to Gmail via IMAP: Establish a connection to Gmail using IMAP. You'll need the user's credentials and the correct IMAP server settings for Gmail.

  2. List All Folders: Use the IMAP command LIST to retrieve a list of all folders in the user's Gmail account.

  3. Check for Folder Attributes: Each folder in the list will have associated attributes. For Gmail, the "All Mail" folder usually has a \All attribute (or a similar system flag).

  4. Identify the "All Mail" Folder: Look for the folder with the \All attribute. This is typically the "All Mail" folder in Gmail.

  5. Handle Multi-Language Support: Since you are identifying the folder based on attributes and not on the name, this method should work regardless of the language.

Here's an example in Python using the imaplib library. Note that error handling and authentication are not fully covered in this snippet:

import imaplib

# Connect to Gmail
imap = imaplib.IMAP4_SSL('imap.gmail.com')

# Log in to the account
imap.login('your_email@gmail.com', 'your_password')

# List all folders
typ, folders = imap.list()
for folder in folders:
    flags, delimiter, folder_name = parse_list_response(folder)

    # Check if the folder has the '\All' attribute
    if '\\All' in flags:
        print(f"'All Mail' folder is: {folder_name}")
        break

# Function to parse the LIST response
def parse_list_response(line):
    parts = line.split()
    flags = parts[0]
    delimiter = parts[1]
    folder_name = ' '.join(parts[2:])
    return flags, delimiter, folder_name

# Don't forget to log out
imap.logout()

This code will connect to the Gmail IMAP server, list all folders, and print out the name of the "All Mail" folder. Remember to replace 'your_email@gmail.com' and 'your_password' with the actual Gmail credentials. Also, ensure you handle exceptions and errors appropriately in your actual implementation.

Betterbird commented 8 months ago

Thanks for investigating this a bit further.

The TB code retrieves the "Gmail All Mail"-ness here: https://searchfox.org/comm-central/rev/bf3f924c4586bf425662b40811e9ec47329f92d9/mailnews/imap/src/nsImapServerResponseParser.cpp#739-741 but sadly then only treats the folder as archive making it non-distinguishable from a regular archive: https://searchfox.org/comm-central/rev/bf3f924c4586bf425662b40811e9ec47329f92d9/mailnews/imap/src/nsImapMailFolder.cpp#1163-1165

One would have to create an "all mail" folder flag and set it to reliably detect the "All Mail" folder later.

So where to we go from here? We suggested to file a ticket at https://bugzilla.mozilla.org/. You could then submit a patch there.

Or optimise the code posted above, right now it's of O(N^2) complexity, and submit it here and we'll integrate it together with creating an additional AllMail folder flag here: https://searchfox.org/comm-central/rev/bf3f924c4586bf425662b40811e9ec47329f92d9/mailnews/base/public/nsMsgFolderFlags.idl#78-85 and setting it where indicated (nsImapMailFolder.cpp#1163-1165).

Your test would then become hdr.folder.flags & Ci.nsMsgFolderFlags.AllMail.

mayerwin commented 7 months ago

Fantastic. How about this revised version that is approximately O(N):

GlodaSyntheticView.prototype.reportResults = function(aItems) {
    try {
        let messageMap = new Map();
        let selectedMessageUpdated = false;

        // First pass: Store messages in a map with messageId as the key
        aItems.forEach(item => {
            const hdr = item.folderMessage;
            if (hdr) {
                if (!messageMap.has(hdr.messageId)) {
                    messageMap.set(hdr.messageId, []);
                }
                messageMap.get(hdr.messageId).push(item);
            }
        });

        // Second pass: Filter out duplicates
        const filteredItems = aItems.filter(item => {
            const hdr = item.folderMessage;
            if (hdr && hdr.folder.name === "All Mail") {
                const duplicates = messageMap.get(hdr.messageId);
                if (duplicates.length > 1) {
                    // Check if any duplicate is not in "All Mail"
                    const hasNonAllMailDuplicate = duplicates.some(dupItem => dupItem.folderMessage.folder.name !== "All Mail");

                    if (hasNonAllMailDuplicate) {
                        // Update selectedMessage if necessary
                        if (hdr.messageId === this.selectedMessage.messageId && !selectedMessageUpdated) {
                            this.selectedMessage = duplicates.find(dupItem => dupItem.folderMessage.folder.name !== "All Mail").folderMessage;
                            selectedMessageUpdated = true;
                        }
                        return false;
                    }
                }
            }
            return true;
        });

        // Call the original reportResults function with the filtered list
        GlodaSyntheticView.prototype.originalReportResultsGlobal88319.call(this, filteredItems);
    } catch (e) {
        console.error('An error occurred while filtering items:', e);
    return true;
    }
}

FYI I used ChatGPT 4 to see if it'd be able to do this, and it is exactly as I'd have done it, impressive as always: https://chat.openai.com/share/4e93d7f6-44af-41d6-b448-f78201e64e8e

Do you need me to do anything else (e.g. file a ticket)? I guess it'll carry more weight if the final patch is submitted by you.

Betterbird commented 7 months ago

This can be optimised further:

  1. If none of the arrays in the map are longer then 1, then there are no duplicates.
  2. If none of the folders involved in the first path is Gmail's "All Mail", you can equally skip the filter step.

Anyway, it will look different when integrating into https://searchfox.org/comm-central/rev/a626faac5a2a2c0a1ae0e81654c8e566def0a6e6/mailnews/db/gloda/modules/GlodaSyntheticView.jsm#79 instead of patching the function and then calling the original.

We'll take it from here including the AllMail flag discussed above.

Betterbird commented 7 months ago

OK, we added this patch: https://github.com/Betterbird/thunderbird-patches/blob/main/115/features/28-feature-no-gloda-all-mail-dupes.patch

This is the code, I think it's better and more efficient that ChatGPT's version. We're not great friends of "JS art" using .filter() and .find() when a proper data structure and a loop can do the same in a more legible way. The "data structure" here is the array with the "All Mail" hits at the end and the non-"All Mail" hits at the front, so no need to search later:

  reportResults(aItems) {
    let messageMap = new Map();

    // First pass: Store messages in a map with messageId as the key.
    for (let item of aItems) {
      let hdr = item.folderMessage;
      if (hdr) {
        if (!messageMap.has(hdr.messageId)) {
          messageMap.set(hdr.messageId, []);
        }
        if (hdr.folder.flags & Ci.nsMsgFolderFlags.AllMail) {
          // Push the header to the end of the array.
          messageMap.get(hdr.messageId).push(hdr);
        } else {
          // Add the header to the front of the array.
          messageMap.get(hdr.messageId).unshift(hdr);
        }
      }
    }

    // Second pass: Traverse the map, report everything except duplicates
    // from "All Mail".
    // The map values are arrays of headers, with the ones not in "All Mail"
    // at the front.
    for (let value of messageMap.values()) {
      if (value.length == 1) {
        let hdr = value[0];
        this.searchListener.onSearchHit(hdr, hdr.folder);
      } else if (value[0].folder.flags & Ci.nsMsgFolderFlags.AllMail) {
        // First hit is in "All Mail" already, so report all hits.
        for (let hdr of value) {
          this.searchListener.onSearchHit(hdr, hdr.folder);
        }
      } else {
        // First hit isn't in "All Mail", so report all hits not in "All Mail".
        for (let hdr of value) {
          if (hdr.folder.flags & Ci.nsMsgFolderFlags.AllMail) {
            // Make sure `this.selectedMessage` references a message we're reporting.
            if (this.selectedMessage.messageId == hdr.messageId) {
              this.selectedMessage = value[0];
            }
            break;
          }
          this.searchListener.onSearchHit(hdr, hdr.folder);
        }
      }
    }
  },

Let us know if you see a problem. We only tested this lightly: A search with a message in a folder and in the "All Mail" folder only showed one message in the result list.

You can download a test build here: https://www.betterbird.eu/downloads/WindowsInstaller/betterbird-115.6.0-bb21-latest-build8-test2.en-US.win64.installer.exe

mayerwin commented 7 months ago

I tested and it seems to work well! I updated my add-on with the performance improvement as well even if it hopefully won't be necessary with TB anymore (I couldn't add the "All Mail" detection flag though). Will you submit the patch to the TB codebase?

Betterbird commented 7 months ago

Will you submit the patch to the TB codebase?

If you file a ticket at https://bugzilla.mozilla.org/, we'll submit the patch.

mayerwin commented 7 months ago

Done: https://bugzilla.mozilla.org/show_bug.cgi?id=1875681 Thanks.

Betterbird commented 7 months ago

As the TB folks don't value our contributions, we refrain from "advertising" our features. If you want, mention the BB fix https://github.com/Betterbird/thunderbird-patches/blob/main/115/features/28-feature-no-gloda-all-mail-dupes.patch in your bug report.

mayerwin commented 7 months ago

Done. Thanks again for your very valuable contributions (with this and all the other features).