RFD: Refine taglist - Githubissues

mjg commented 3 years ago

This is just a Request For Discussion, not a real PR yet.

The main theme is supercharging the taglist buffer so that you can search and navigate quickly in alot. So far, :taglist simply lists all tags in the db. We can do more!

The first patches are clean-up to make similar buffers (bufferlist, namedqueries, taglist) behave in more expected ways (UI+code):

do not advertise filtfun for namedqueries since it is not implemented
open only one namedqueries buffer (just as for taglist and bufferlist - they cannot differ)
correct default function signatures

The first main patch implements an argument for taglist which allows to restrict the taglist: :taglist foo returns only tags containing foo. In fact, full regex are allowed. This is similar but not equal to notmuch --output=tags tag:/foo/. Indeed, :taglist foo helps you find those tags which you are looking for, whereas the notmuch command lists more tags (from threads which match the tag query).

Next, :taglist --msgs restricts the tags to those from the current search or thread buffer. This let's you easily find all tags "from a year" for example: :search date:2020 followed by :taglist --msgs.

Finally, to make this resulting taglist from buffers useful for refining searches, the last patch in the series makes :select in a taglist buffer open a new search with a refined search: the search we came from (if we came from a search or thread buffer with :taglist --msgs) ANDed with the search for the selected tag.

Yes, tests are missing and need to be done. But also I'm wondering what the final UI should look like, in particular:

If taglist buffers can differ (different queries) should we allow opening multiple of them? (Probably yes)
Should the restriction mode :taglist --msgs be the default on search and thread buffers (and the opposite flag --global be introduced)?
Should the taglist buffer offer two different modes of select, i.e. searching just for the tag and the combined search, and which should be default?
Do we want a taglistprompt command to speed up regex searches for tags?
Besides, I'm not sure how useful the current tags parameter to taglist is in the UI - how do you specify a list, and how useful is it to get exactly that list back?

mjg commented 3 years ago

codeclimate and Codacy have different ideas about preferred line lengths ... I've fixed the indentations, although this is really just RFD and may need refactoring, different UI or different defaults. Plus tests and doc :)

mjg commented 3 years ago

Travis doc build "fails" because I did document the new arguments ... which gives a none empty diff.

pazz commented 3 years ago

Hey Michael, thanks for this, it looks really nice already.

I personally don't use the taglist much, which is why I did not put much effort into making it nice. But I agree that this could be significantly improved.

Quoting Michael J Gruber (2020-10-17 17:09:18)

The first patches are clean-up to make similar buffers (bufferlist, namedqueries, taglist) behave in more expected ways (UI+code):

• do not advertise filtfun for namedqueries since it is not implemented • open only one namedqueries buffer (just as for taglist and bufferlist - they cannot differ) • correct default function signatures

Mostly fine with me, except that I'd prefer to remove the hard-limit of 1 taglist buffer instead of imposing the same on namedqueries. This would simply the code and let the user decide. It was a weird design choice on my part in retrospect..

The first main patch implements an argument for taglist which allows to restrict the taglist: :taglist foo returns only tags containing foo. In fact, full regex are allowed. This is similar but not equal to notmuch --output=tags tag:/foo/. Indeed, :taglist foo helps you find those tags which you are looking for, whereas the notmuch command lists more tags (from threads which match the tag query).

Very nice!

Next, :taglist --msgs restricts the tags to those from the current search or thread buffer. This let's you easily find all tags "from a year" for example: :search date:2020 followed by :taglist --msgs.

Nice idea! Two comments: Firstly, I'm not sold on the argument name as it is not really self-explanatory. Can we come up with a better one?

Secondly, this patch does not play well with the (recently pushed) move to notmuch2. Below is a trace after a taglist --msgs on your patchset on top of alot master (with notmuch2).

  File "/home/pazz/projects/alot/alot/ui.py", line 723, in apply_command
    cmd.apply(self)
  File "/home/pazz/projects/alot/alot/commands/globals.py", line 586, in apply
    tags = list(ui.dbman.query(ui.current_buffer.querystring).
AttributeError: 'DBManager' object has no attribute 'query'

Finally, to make this resulting taglist from buffers useful for refining searches, the last patch in the series makes :select in a taglist buffer open a new search with a refined search: the search we came from (if we came from a search or thread buffer with :taglist --msgs) ANDed with the search for the selected tag.

cool.

Yes, tests are missing and need to be done. But also I'm wondering what the final UI should look like, in particular:

• If taglist buffers can differ (different queries) should we allow opening multiple of them? (Probably yes)

Yea, I'd say so.

• Should the restriction mode :taglist --msgs be the default on search and thread buffers (and the opposite flag --global be introduced)?

No strong opinion on that.

• Should the taglist buffer offer two different modes of select, i.e. searching just for the tag and the combined search, and which should be default?

That'd be helpful I guess. Perhaps we can replace the "select" command anyhow: IIRC it was only there to get a uniform binding across buffer types but in the unlikely case where someone will ever want to use something different from "return", they can remap this for all 5? buffer types.

More to the point here, yes that should probably be included, with a default key binding to shift-return or similar.

• Do we want a taglistprompt command to speed up regex searches for tags?

AFAIR, the Xprompt commands were only there to open the normal prompt with a pre-populated command stub. What you're probably thinking of is to adjust the command completer in some way that would enable tab completion for command prefixes of the form taglist cmdprefix?

https://github.com/pazz/alot/blob/master/alot/completion/command.py#L133

• Besides, I'm not sure how useful the current tags parameter to taglist is in the UI - how do you specify a list, and how useful is it to get exactly that list back?

Not sure what you mean.

One additional cool feature would be to have a "tag remove" command that would simply remove a (selected) tag from all messages in your database. I have so many auto-generated mailinglist tags "list/blaa" that clutter the taglist and make it unusable as it is.

Also, one could think about a tag-cloud like presentation for this (or derived) buffer. urwid actually allows for larger fonts somehow..

pazz commented 3 years ago

Travis doc build "fails" because I did document the new arguments ... which gives a none empty diff.

It looks like it fails because you did not commit the generated docs.

index 955b6ff..880c9e1 100644
894--- a/docs/source/usage/modes/global.rst
895+++ b/docs/source/usage/modes/global.rst
896@@ -226,6 +226,10 @@ The following commands are available globally:
897 
898     opens taglist buffer
899 
900+    argument
901+        regular expression to match
902+
903     optional arguments
904+        :---msgs: list tags from displayed messages
905         :---tags: tags to display
906

try make -C docs cleanall html and check in the autogenerated changes?

mjg commented 3 years ago

So, here it is on top of notmuch2 bindings.

In addition, I took up the suggestions from above (do not limit buffers to 1 instance, except for the bufferlist buffer; rename the option) and went ahead with changing the default behaviour of taglist to be "buffer local" and with select to be "stay within current search and refine by tag"; at the same time allowing to "break out" to a global search again. BTW: urwid does not return shift enter so I went for meta enter.

Technically, I had to use a cffi binding for collect_tags() directly now since it is missing. I hope this is cool. (I might prod upstream to include it.)

Next up: look at those CI reports (can I get them ahead of the push?) and implement a few more taglist commands (such as remove tag from all messages).

mjg commented 3 years ago

About pep8: a lot of files on master do not conform to pep8. Should I rebase this series so that at least the new lines do conform? Or do some overall pep8 sanitize commit upfront?

mjg commented 3 years ago

So, I'll stop after these two more commits ... For the last one I need help: I cannot get the tagline to disappear from the list after I delete a tag. Also, you may prefer an extra taglist widget similar to the namedqueries thing. So the last one is more to demonstate what is possible: quickly acting upon selected tags.

mjg commented 3 years ago

Wait: did you not claim to remove this "one buffer only" restriction completely both for taglist and bufferlist buffers? bf5bd3e claims to do this but only does it for Taglist. Should that patch not do the same for BufferListBuffer and thereby obsolete this commit?

I don't mind either way. It's still RFD :) With the current series, the bufferlist buffer is the only one which can exist only once. This is meant to help those who have not bound bnext or are not aware of it. But going for full consistency is fine with me if you prefer that.

I'll rebase for make doc and pep8 anyways.

mjg commented 3 years ago

I tried this with a re in quotes because I wasn't sure. The command taglist "lists/*" blows up:

DEBUG:globals:CMDLINE: taglist "lists/*"
DEBUG:ui:search command string: "taglist "lists/*""
DEBUG:__init__:mode:search got commandline "taglist "lists/*""
DEBUG:__init__:ARGS: ['taglist', '"lists/*"']
DEBUG:__init__:cmd parms {'globally': False, 'tags': None, 'match': '"lists/*"'}
ERROR:ui:Traceback (most recent call last):
  File "/home/pazz/projects/alot/alot/ui.py", line 723, in apply_command
    cmd.apply(self)
  File "/home/pazz/projects/alot/alot/commands/globals.py", line 598, in apply
    ui.buffer_open(buffers.TagListBuffer(ui, tags, self.filtfun, querystring, self.match))
  File "/home/pazz/projects/alot/alot/buffers/taglist.py", line 25, in __init__
    self.rebuild()
  File "/home/pazz/projects/alot/alot/buffers/taglist.py", line 94, in rebuild
    self.taglist.set_focus(focusposition % len(displayedtags))
ZeroDivisionError: integer division or modulo by zero

Oh yes, I have to check for "no match at all". Technically, this problem is not new, but I guess nobody had a database with 0 tags in total ;)

pazz commented 3 years ago

Sorry my bad: you claimed to have taglist and named queries buffers behave the same and leave bufferlist as is. I am not sure what's best here: have all three behave equally makes sense and keeps the code cleaner, but in a way several bufferlists are pointless (which has never before kept us from allowing it :D) For sake of keeping this PR lean, let's keep it as is.

Quoting Michael J Gruber (2021-01-01 15:38:19)

Wait: did you not claim to remove this "one buffer only" restriction
completely both for taglist and bufferlist buffers?
bf5bd3e claims to do this but only does it for Taglist. Should that patch
not do the same for BufferListBuffer and thereby obsolete this commit?
I don't mind either way. It's still RFD :) With the current series, the bufferlist buffer is the only one which can exist only once. This is meant to help those who have not bound bnext or are not aware of it. But going for full consistency is fine with me if you prefer that.

I'll rebase for make doc and pep8 anyways.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.*

mjg commented 3 years ago

Sorry my bad: you claimed to have taglist and named queries buffers behave the same and leave bufferlist as is. I am not sure what's best here: have all three behave equally makes sense and keeps the code cleaner, but in a way several bufferlists are pointless (which has never before kept us from allowing it :D) For sake of keeping this PR lean, let's keep it as is.

Just to get it right: "as is" means "before this series", so I should simply not touch bufferlist handling, right? It is not technically related to the series anyways and can wait. So I'll keep this PR strictly to "supercharging" the taglist.

How about the change in "taglist, bufferlist: provide defaults of correct signature"?

pazz commented 3 years ago

grr, github seems to have disabled the feature to accumulate comments over commits when writing reviews. I hope my comments are somehow related to the individual commits..

BTW: urwid does not return shift enter so I went for meta enter.

WTF? really?

pazz commented 3 years ago

Quoting Michael J Gruber (2021-01-01 15:53:42)

Just to get it right: "as is" means "before this series", so I should simply not touch bufferlist handling, right? It is not technically related to the series anyways and can wait. So I'll keep this PR strictly to "supercharging" the taglist.

Yes, that sounds good.

How about the change in "taglist, bufferlist: provide defaults of correct signature"?

sure. I have no opinion on this.

About pep8: a lot of files on master do not conform to pep8. Should I rebase this series so that at least the new lines do conform? Or do some overall pep8 sanitize commit upfront?

I usually try to adhere to pep8, except if the character limit per line leads to ridiculous constructs. So yes, if you don't mind try to stick to pep8 for your changes in this PR, but there's no need to fix code outside.

Technically, I had to use a cffi binding for collect_tags() directly now since it is missing. I hope this is cool. (I might prod upstream to include it.)

Fine for now, but yes: the nicer solution would be to lobby for this command in the notmuch2 bindings directly.

Next up: look at those CI reports (can I get them ahead of the push?)

I don't think so, but that's no problem as you can always rebase -i and force-push to keep the PR clean. Don't worry about this.

bc507fd54daa3aa68de58c2b0c453aabeed7c48c

Nice. two questions: do you distinguish between search and thread as "original" buffer just before the taglist is opened? Secondly, I'm confused by the statusline. For instance:

[1: taglist] for "tag:inbox AND NOT tag:killed" matching "*"

This means tags matching * in messages matching "...", right?

acdff3c6209a0b29fbea1e6df56feb0116c26ab6 : Why not enrich the taglist by another column that shows separate TextWidgets in case the tag is hidden etc, or the "original" tagstring?

Final comment: Is it possible to right-align the final two columns? That should be doable without changes in your code and only by changing (my) theme file right?

mjg commented 3 years ago

I'm torn about this one. One the one hand it enables a natural "refine by tag" functionality (should a select then not refocus the original buffer instead of opening a new one?). On the other hand this is really not intuitive if you start with a named search or bufferlist buffer..

It's somehow the choice between the philosophies "do what you think I mean" and "do exactly what I tell you to", in this case the new default behaviour depending on which buffer we come from or explicit user choice.

Technically, it's either having an extra option --globally (buffer-local being the default, as in this series right now) or an extra option --locally or some better name (and global by default, as is current behaviour). I would always bind this to a key anyways. Maybe not changing the current behaviour is less intrusive and more in line with alot's overall command ui.

[I'm also kind of confused by github's comment interface right now, I'll answer by replying one comment by one.]

mjg commented 3 years ago

Technically, I had to use a cffi binding for collect_tags() directly now since it is missing. I hope this is cool. (I might prod upstream to include it.) Fine for now, but yes: the nicer solution would be to lobby for this command in the notmuch2 bindings directly.

Submitted on the notmuch ML (from this branch).

do you distinguish between search and thread as "original" buffer just before the taglist is opened?

Yes, by type of ui.current_buffer; depending on that I query for tags and pass that to the taglist buffer.

Secondly, I'm confused by the statusline. For instance: [1: taglist] for "tag:inbox AND NOT tag:killed" matching "" This means tags matching in messages matching "...", right? [acdff3c]

Yes, maybe the wording in the status line can be clearer. First all tags from the query are collected (i.e. the union of all tags of messages which match the query), and then those tages are matched (textually) against the regexp. I try to express "global query as "" (as notmuch does) and "no regexp" as "" (even though "" or ".*" would be technically correct). Dunno.

Why not enrich the taglist by another column that shows separate TextWidgets in case the tag is hidden etc, or the "original" tagstring?

I wanted the alignment to be the same as before - the existing alignment looked good to me, since this extra info as usually empty. Also, typical translations are 1 character symbols replacing a word like attachment, so it looks funny to have that 1 character symbol in the tag name column and then the translation in the usually empty extra column (I tried).

Final comment: Is it possible to right-align the final two columns? That should be doable without changes in your code and only by changing (my) theme file right?

count is aligned right, unread_count is aligned left so that they "line up" in the middle, exactly like in the namedqueries buffer. At least that's how I wrote the code adjusted the default theme (by copying from that buffer). Both counts together end up in one TextWidget (and thus one column).

For me, this sometimes gets distorted by some "flag translation symbols", although this depends on the font and the terminal; it happens in other widgets, too.

pazz commented 3 years ago

OK cool. Thanks for the update. This looks almost ready then. Apart from the codeclimate/pep8 stuff, the issues flagged by codacity look reasonable to me. E.g. refactoring code with pass commands. The travis builds seem to fail only because of outdated documentation, which can be autogenerated (as above). Not sure about tests.

mjg commented 3 years ago

I tried to catch the errors with no tags now and managed to solve the "NO WORKEE" for untag. I've also run each commit through autopep8 though codeclimate seems to think otherwise. If the new defaults (regarding branch local) are OK I'll hunt down the remaining code climate issues manually. [Edit: Turns out autopep8 does not do what it claims to do without allowing it to be "aggressive" - it left longer lines which need just whitespace changes. Might have to rerun it.]

pazz commented 3 years ago

I can confirm that the bug with an empty taglist is fixed now.

Now I wonder why we even need the extra filter for tag strings. If I only want to see a certain subset of tags then I could amend the "original" query by adding AND tag:X AND tag:Y...

mjg commented 3 years ago

I can confirm that the bug with an empty taglist is fixed now.

Now I wonder why we even need the extra filter for tag strings. If I only want to see a certain subset of tags then I could amend the "original" query by adding AND tag:X AND tag:Y...

You mean the regexp match? I guess it depends on the way you tag. Say, you deal with a lot of bachelor theses. Do you tag e-mails as BA/Einstein, BA/Witten or similar? Then a regexp match for BA gives you all bachelor projects without long ORs or constantly out-of-sync named queries. You could tag them as BA, Einstein and BA, Witten instead, but that means you always have to use AND queries if you search within a specific project.

But maybe I misunderstand your question.

pazz commented 3 years ago

Quoting Michael J Gruber (2021-01-03 16:39:04)

I can confirm that the bug with an empty taglist is fixed now.

Now I wonder why we even need the extra filter for tag strings. If I only
want to see a certain subset of tags then I could amend the "original"
query by adding AND tag:X AND tag:Y...
You mean the regexp match? I guess it depends on the way you tag. Say, you deal with a lot of bachelor theses.

I do in fact :D

Do you tag e-mails as BA/Einstein, BA/Witten or similar?

No, never thought of it..

Then a regexp match for BA gives you all bachelor projects without long ORs or constantly out-of-sync named queries. You could tag them as BA, Einstein and BA, Witten instead, but that means you always have to use AND queries if you search within a specific project.

I see. this is because notmuch queries cannot per se search for tags by regexp. makes sense.

Another thing: I see that the syntax of the taglist commands has both a regex argument and --tags parameter, which lets you list tags. I presume that the latter is obsoleted by a RE that is a disjunction of tags?

mjg commented 3 years ago

I see. this is because notmuch queries cannot per se search for tags by regexp. makes sense.

Exactly.

Another thing: I see that the syntax of the taglist commands has both a regex argument and --tags parameter, which lets you list tags. I presume that the latter is obsoleted by a RE that is a disjunction of tags?

I'm not sure why it was there: You could specify a list of tags (though I never figured out how to pass a list from the UI), and the taglist buffer would return that exact list of tags.

A correct regexp for listing tags foo and bar is ^foo$|^bar$, but often foo|bar gives the same. I would argue for removing the --tags parameter from the taglist command.

BTW: I think it's great to think this feature through thoroughly before merging. If I stop answering quickly in the next days it's because term starts again ...

pazz commented 3 years ago

I would argue for removing the --tags parameter from the taglist command.

Agreed!

mjg commented 3 years ago

Since it came up: The select command in this PR quotes a tag to be on the safe side, i.e. it ands tag:"alot" to the query when refining for the tag alot. I'm not sure if we do this on other occasions (or if it comes up at all) or if there is a better way to do it. Quoting of notmuch search terms is somewhat peculiar (see thread:"{...}").

pazz commented 3 years ago

I don't think I did this before and am not sure if it's necessary unless your tag contains whitespaces.

mjg commented 3 years ago

Hmm, meanwhile I've learned that notmuch does support regex in tag searches now. notmuch search tag:/BA/ gives all threads with tags which contain BA. If I open a taglist buffer on that search (with this PR) I get more tags then just those matching BA (namely those from all messages from threads with tags matching BA). But I'm wondering whether that is sufficient (to skip the match from this PR), or whether it simply means there are two similar but different ways to do two similar but different things.

pazz commented 3 years ago

Quoting Michael J Gruber (2021-01-04 13:32:41)

Hmm, meanwhile I've learned that notmuch does support regex in tag searches now. notmuch search tag:/BA/ gives all threads with tags which contain BA. If I open a taglist buffer on that search (with this PR) I get more tags then just those matching BA (namely those from all messages from threads with tags matching BA). But I'm wondering whether that is sufficient (to skip the match from this PR), or whether it simply means there are two similar but different ways to do two similar but different things.

I don't have strong feelings regarding this, as I don't use the taglist myself. My guiding principle for such things is to keep the codebase lean and clean, because I cannot count on the individual committer to stick around and maintain their code.

Besides, for your use case with tags called BA/X etc, you could simply add an additional tag "BA" to all those messages and search for exactly that one. After all, tags are cheap. I do this for mails sent to mailing lists, which I tag by "lists" and by "list-X"..

pazz / alot

RFD: Refine taglist #1546