tsto / notmuchfs

A virtual maildir file system for notmuch queries

thread reconstruction #5

Open mturquette opened 8 years ago

mturquette commented 8 years ago

Is it possible for notmuchfs to reconstruct whole threads instead of only matching individual messages?

I deal with a high volume of mail (the linux-kernel mailing list), and my workflow often consists of searching for mail from a specific author and then, depending on the content of that author's email, marking the whole thread as read so that I can get back to inbox zero.

Unfortunately, because notmuchfs matches individual messages, I only mark the matching messages as read, which doesn't capture the whole thread.

This behavior is the default in alot and I'm pretty addicted to it, even after migrating fully over to notmuchfs + mutt.

mturquette commented 8 years ago

OK, I was able to quickly hack up a prototype that uses threads instead of messages. The notmuch API is well documented and really easy to work with, even for first-timers! Here is the hack:

diff --git a/mutt/bin/notmuch_tag b/mutt/bin/notmuch_tag
index 27b6b33..3f23491 100755
--- a/mutt/bin/notmuch_tag
+++ b/mutt/bin/notmuch_tag
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
 #
 # Script to prompt the user for the tag string to use (e.g. '+tag1 -tag2'),
 # then apply it to the given message(s).
@@ -28,6 +28,7 @@ TAGS=$1
 # Fetch the list of message IDs.
 while true; do
   read -p "Message-ID: " ID
+  echo $ID
   if [ -z "$ID"  ]; then
     break
   fi
diff --git a/notmuchfs.c b/notmuchfs.c
index 8c1c6df..5414ee6 100644
--- a/notmuchfs.c
+++ b/notmuchfs.c
@@ -385,7 +385,8 @@ typedef struct
   * @{
   */
  notmuch_query_t    *p_query;
- notmuch_messages_t *p_messages;
+ notmuch_threads_t  *p_threads;
+ //notmuch_messages_t *p_threads;
  /** @} */

  /** This is for type == OPENDIR_TYPE_BACKING_DIR. */
@@ -465,8 +466,8 @@ static int notmuchfs_opendir (const char* path, struct fuse_file_info* fi)
      dir_fd->next_offset = 1;
      dir_fd->p_query = notmuch_query_create(p_ctx->db, trans_name);
      if (dir_fd->p_query != NULL) {
-       dir_fd->p_messages = notmuch_query_search_messages(dir_fd->p_query);
-       if (dir_fd->p_messages == NULL) {
+       dir_fd->p_threads = notmuch_query_search_threads(dir_fd->p_query);
+       if (dir_fd->p_threads == NULL) {
          notmuch_query_destroy(dir_fd->p_query);
          dir_fd->p_query = NULL;
          database_close(p_ctx);
@@ -505,8 +506,8 @@ static int notmuchfs_releasedir (const char *path, struct fuse_file_info *fi)
  opendir_t *dir_fd = (opendir_t *)(uintptr_t)fi->fh;
  if (dir_fd != NULL) {
    if (dir_fd->type == OPENDIR_TYPE_NOTMUCH_QUERY) {
-     if (dir_fd->p_messages != NULL)
-       notmuch_messages_destroy(dir_fd->p_messages);
+     if (dir_fd->p_threads != NULL)
+       notmuch_threads_destroy(dir_fd->p_threads);
      if (dir_fd->p_query != NULL)
        notmuch_query_destroy(dir_fd->p_query);

@@ -615,18 +616,28 @@ static int notmuchfs_readdir (const char            *path,
         break;
       }

-      notmuch_message_t *p_message = NULL;
-      while (res == 0 &&
-             (p_message = notmuch_messages_get(dir_fd->p_messages)) != NULL) {
-
-        res = fill_dir_with_message(dir_fd, p_message, buf, filler);
-
-        notmuch_message_destroy(p_message);
-        if (res == INT_MAX) {
-          res = 0;
+      notmuch_thread_t *p_thread = NULL;
+      while ((p_thread = notmuch_threads_get(dir_fd->p_threads)) != NULL) {
+        notmuch_messages_t *p_messages = NULL;
+        if ((p_messages = notmuch_thread_get_messages(p_thread)) == NULL)
           break;
+
+        notmuch_message_t *p_message = NULL;
+        while (res == 0 &&
+             (p_message = notmuch_messages_get(p_messages)) != NULL) {
+          res = fill_dir_with_message(dir_fd, p_message, buf, filler);
+
+          notmuch_message_destroy(p_message);
+          if (res == INT_MAX) {
+            res = 0;
+            break;
+          }
+          //notmuch_messages_move_to_next(dir_fd->p_messages);
+          notmuch_messages_move_to_next(p_messages);
         }
-        notmuch_messages_move_to_next(dir_fd->p_messages);
+        notmuch_thread_destroy(p_thread);
+        //notmuch_messages_move_to_next(dir_fd->p_threads);
+        notmuch_threads_move_to_next(dir_fd->p_threads);
       }
       break;
      }

And it works! Sort of... Strangely, in mutt I can never load more than 28 messages at a time. No matter the search, if more than 28 messages match, only the 28 most recent are returned and the rest are truncated (they don't exist in the notmuchfs directory).

find ~/mail/.notmuch/notmuchfs/from:heiko\ and\ tag:linux-clk\ and\ tag:unread/cur/|wc
     29     145    5865

(29 lines instead of 28 because the count also includes the directory "home/mturquette/mail/.notmuch/notmuchfs/from:heiko and tag:linux-clk and tag:unread/cur/" itself, which is not an actual email.)

Any thoughts on what my code above is doing wrong? I didn't dig too much into all of the last_slash terminator stuff. I'll revisit this some day, but I don't have time now, so hopefully someone better versed in the code base can take a look?

mturquette commented 8 years ago

OK, I looked a bit more into readdir and the filler stuff. I now realize that the first two entries are "." and "..". Also, the filler always fills up at 30 entries:

readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182126_2.1793.quark,U=3404,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 29
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182126_3.1793.quark,U=3405,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 30
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182127_0.1793.quark,U=3408,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449182127_0.1793.quark,U=3408,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182118_5.1793.quark,U=3365,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449182118_5.1793.quark,U=3365,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182118_2.1793.quark,U=3362,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449182118_2.1793.quark,U=3362,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182108_3.1793.quark,U=3309,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449182108_3.1793.quark,U=3309,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#important#cur#1448785675_0.4858.quark,U=8301,FMD5=48818f3da3d36ad762480b319b56f588:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#important#cur#1448785675_0.4858.quark,U=8301,FMD5=48818f3da3d36ad762480b319b56f588:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449182067_3.1793.quark,U=3079,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449182067_3.1793.quark,U=3079,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449181919_1.1793.quark,U=2172,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2, at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449181919_1.1793.quark,U=2172,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,".
readdir filling dir #home#mturquette#mail#baylibre#linux-clk#cur#1449181875_3.1793.quark,U=1924,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,S at 31
readdir filler full "#home#mturquette#mail#baylibre#linux-clk#cur#1449181875_3.1793.quark,U=1924,FMD5=9a405a83facba9ed7e2ca2d6ce1fbbf0:2,S".

Since the first two entries ("." and "..") don't show up in mutt, this explains why mutt only shows 28 entries (30 less 2).

Why does the filler fill up? I have no idea. I started digging into libfuse/fuse.c, but hopefully someone else can point me in the right direction.
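
One possible reading of the log above, offered as a hedged guess rather than a confirmed diagnosis: a FUSE filler callback returns 1 when the buffer for the current readdir call is full, and the kernel is then expected to call readdir again to fetch the rest. In the patch above, the inner loop breaks when `res == INT_MAX`, but the outer thread loop does not check for that, so it appears to keep advancing through every remaining thread while the buffer is already full (matching the repeated "filler full" lines at offset 31). By the time readdir is called again, the thread iterator would be exhausted. The sketch below simulates the alternative of stopping both loops and keeping a cursor across calls. All names here (`mock_buf_t`, `mock_filler`, `cursor_t`, `readdir_once`, `readdir_all`) are hypothetical and are not part of notmuchfs, libfuse, or libnotmuch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the FUSE readdir buffer: it accepts at most
 * `capacity` entries per readdir call. */
typedef struct {
  int capacity;
  int count;
} mock_buf_t;

/* Mimics the filler convention: returns 1 if the buffer is full (the entry
 * is NOT stored), 0 if the entry was stored. */
static int mock_filler(mock_buf_t *buf)
{
  if (buf->count >= buf->capacity)
    return 1;
  buf->count++;
  return 0;
}

/* Hypothetical cursor that would live in the open-directory state so that
 * iteration can resume on the next readdir call. */
typedef struct {
  size_t thread_idx; /* thread currently being listed */
  size_t msg_idx;    /* next message to emit within that thread */
} cursor_t;

/* One simulated readdir call over threads whose message counts are given in
 * thread_sizes. Returns the number of entries emitted by this call. When
 * the buffer fills up, BOTH loops stop and the cursor is left pointing at
 * the entry that did not fit. */
static int readdir_once(const size_t *thread_sizes, size_t n_threads,
                        cursor_t *cur, mock_buf_t *buf)
{
  int emitted = 0;
  while (cur->thread_idx < n_threads) {
    while (cur->msg_idx < thread_sizes[cur->thread_idx]) {
      if (mock_filler(buf) != 0)
        return emitted; /* full: do not advance past the unsent entry */
      cur->msg_idx++;
      emitted++;
    }
    cur->thread_idx++;
    cur->msg_idx = 0;
  }
  return emitted;
}

/* Simulates the kernel calling readdir repeatedly with a fresh buffer until
 * every entry has been listed. Returns the total number of entries. */
static int readdir_all(const size_t *thread_sizes, size_t n_threads,
                       int capacity)
{
  cursor_t cur = {0, 0};
  int total = 0;
  assert(capacity > 0); /* otherwise no call could ever make progress */
  while (cur.thread_idx < n_threads) {
    mock_buf_t buf = {capacity, 0};
    total += readdir_once(thread_sizes, n_threads, &cur, &buf);
  }
  return total;
}
```

With three threads of 5, 40, and 3 messages and a capacity of 30 entries per call, `readdir_all` lists all 48 entries over two calls instead of stopping at the first full buffer. In real code the cursor would have to be expressed in terms of the libnotmuch iterators, which this sketch does not attempt.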

tsto commented 5 years ago

Thank you for the patch. It changes the behavior from messages to threads. I think we'd want to make this configurable, so that users can tune it to their workflow, perhaps with something like 'type=thread' in the notmuchfs directory name to trigger it.
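
As a rough illustration of that idea, the sketch below checks a directory name for a leading `type=thread` token and strips it before the rest is used as the query string. The exact syntax is an assumption made here for illustration (a `type=thread ` prefix followed by a space); the thread does not specify one, and `result_type_t` and `parse_result_type` are hypothetical names that do not exist in notmuchfs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical selector for what a query directory should list. */
typedef enum { RESULT_MESSAGES, RESULT_THREADS } result_type_t;

/* Assumed syntax: a directory name that starts with "type=thread " asks for
 * whole threads. Stores the selected mode in *p_type and returns a pointer
 * to the remaining query string inside dir_name (nothing is copied). */
static const char *parse_result_type(const char *dir_name,
                                     result_type_t *p_type)
{
  static const char prefix[] = "type=thread ";
  const size_t prefix_len = sizeof(prefix) - 1;

  assert(dir_name != NULL && p_type != NULL);

  if (strncmp(dir_name, prefix, prefix_len) == 0) {
    *p_type = RESULT_THREADS;
    return dir_name + prefix_len;
  }

  /* No token present: keep the existing per-message behavior. */
  *p_type = RESULT_MESSAGES;
  return dir_name;
}
```

Under these assumptions, a directory named `type=thread from:heiko and tag:unread` would select thread mode with the query `from:heiko and tag:unread`, while a plain `tag:unread` would keep listing individual messages.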

mturquette commented 5 years ago

I'm no longer using notmuchfs as part of my email workflow, so feel free to close this issue.