qbittorrent / qBittorrent

qBittorrent BitTorrent client
https://www.qbittorrent.org

Add a simple syntax for more advanced filters #21059

Open wdeweijer opened 1 month ago

wdeweijer commented 1 month ago

Suggestion

Currently, filtering torrents or files without regular expressions performs just a simple (case-insensitive) string search. This is unexpected because most search engines provide a simple syntax to find more results. For example:

  1. Searching for john doe also matches doe, john and john saw a doe
  2. Words can be excluded by prefixing them with a minus sign: doe -john should match jane doe but not john doe
  3. Strings enclosed in quotes are matched literally (as the current filtering works) so that "john doe" matches only john doe and not doe, john or john.doe

While (2) and (3) are a bit more for power-users, (1) is usually the expected behaviour, so this feature request is specifically only about (1). (Although (2) and (3) are probably relatively easy to implement once (1) is.)

Currently, the context menu for the torrent filter provides a toggle to turn on regular expressions. Similarly (and slightly inconsistently) in 5.0, the context menu for the file filter has a submenu to select either fixed string, wildcard, or regex. If this more advanced filtering is implemented I would suggest adding it to this list (perhaps even as the default).

Use case

I have several thousand torrents that do not always follow the same naming conventions. Instead of spaces between words, sometimes periods or underscores are used. Filtering the torrents by john doe should also match john.doe and john_doe as well as doe, john.

Extra info/examples/attachments

It seems like the Qt QSortFilterProxyModel (which is used to filter the transfer list) only supports filtering by fixed strings, wildcards, or regular expressions. This is unfortunate since regular expressions aren't the best solution to this problem. However, Qt does support lookahead. Hence a solution outlined here can work.

Indeed, filtering for (?=.*john)(?=.*doe) currently works as expected and I don't notice a performance hit with several thousand torrents (although this should be tested with many more).
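To illustrate why the lookahead pattern above matches the words in any order, here is a minimal sketch. It uses Python's `re` module as a stand-in for the regex engine Qt uses (an assumption made only so the example is runnable); the torrent names are made up.

```python
import re

# Each (?=.*word) is a lookahead: it checks that "word" occurs somewhere
# ahead without consuming input, so the order of the words does not matter.
# Python's `re` is only a stand-in for Qt's regex engine here.
pattern = re.compile(r"(?=.*john)(?=.*doe)", re.IGNORECASE)

# Hypothetical torrent names for illustration.
names = ["john doe", "doe, john", "John.Doe", "john_doe", "jane doe", "john smith"]

found = [name for name in names if pattern.search(name)]
print(found)
```

Names containing both words are kept regardless of order or separator, while "jane doe" and "john smith" are dropped because one of the lookaheads fails.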

This filtering behaviour (together with "Filter by" combo box) could also help searching for tags as suggested in https://github.com/qbittorrent/qBittorrent/issues/11928.

glassez commented 1 month ago

It seems like the Qt QSortFilterProxyModel (which is used to filter the transfer list) only supports filtering by fixed strings, wildcards, or regular expressions. This is unfortunate since regular expressions aren't the best solution to this problem. However, Qt does support lookahead. Hence a solution outlined here can work.

Indeed, filtering for (?=.*john)(?=.*doe) currently works as expected and I don't notice a performance hit with several thousand torrents (although this should be tested with many more).

Would you mind submitting a Pull Request that provides this functionality?

wdeweijer commented 1 month ago

I just barely know enough about C++ and Qt to get this far! I also currently don't have access to a working build environment.

I would imagine the most basic implementation in https://github.com/qbittorrent/qBittorrent/blob/9feefc814497345670caa13b0b843343a269779d/src/gui/transferlistwidget.cpp#L1327 would be to split the given string on spaces, surround each resulting word with (?=.* and ), and then concatenate it all together.

The words would need to be escaped to prevent them from interacting with the rest of the regex. I think Qt provides this.
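The split, escape, and wrap steps described above can be sketched as follows. This is not the qBittorrent code: it is a Python illustration in which `build_lookahead_pattern` is a hypothetical helper name and `re.escape` plays the role that `QRegularExpression::escape()` would play on the Qt side.

```python
import re

def build_lookahead_pattern(query: str) -> str:
    """Turn 'john doe' into '(?=.*john)(?=.*doe)' (hypothetical helper)."""
    # Split on whitespace; each word becomes one lookahead.
    words = query.split()
    # Escape every word so characters such as '+' or '.' are matched
    # literally instead of being interpreted as regex syntax.
    return "".join(f"(?=.*{re.escape(word)})" for word in words)

pattern = build_lookahead_pattern("john c++")
print(pattern)
print(bool(re.search(pattern, "C++ notes by John", re.IGNORECASE)))
```

Without the escaping step, a query such as c++ would produce an invalid or misleading pattern, which is why the words cannot be pasted into the regex as typed.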

wdeweijer commented 1 month ago

I think this works. It implements all three of the requested features. I haven't tested the performance with more than a handful of torrents, and I don't have a good way of testing more.

However, using regular expressions to solve this problem is a bit of a hack. A better solution would be to modify https://github.com/qbittorrent/qBittorrent/blob/9feefc814497345670caa13b0b843343a269779d/src/gui/transferlistsortmodel.cpp#L264
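For comparison, the three requested features can be expressed without regular expressions in the matching step itself. The sketch below is only an illustration of that idea in Python, not a proposal for the actual Qt code: the token grammar, `parse_query`, and `query_matches` are all hypothetical.

```python
import re

# Hypothetical token grammar: an optional leading minus, followed by either
# a "quoted phrase" or a run of non-whitespace characters.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into terms that must and must not appear."""
    include, exclude = [], []
    for minus, phrase, word in TOKEN.findall(query):
        term = (phrase or word).casefold()
        if not term:
            continue  # ignore empty quotes such as ""
        (exclude if minus else include).append(term)
    return include, exclude

def query_matches(query, name):
    """True if name contains every included term and no excluded term."""
    include, exclude = parse_query(query)
    haystack = name.casefold()
    return (all(term in haystack for term in include)
            and not any(term in haystack for term in exclude))

print(query_matches("john doe", "doe, john"))
print(query_matches("doe -john", "john doe"))
print(query_matches('"john doe"', "john.doe"))
```

Each term is a plain substring test, so there is nothing to escape and the cost per row is a handful of string searches.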

In addition, my assertion that the filtering currently uses a simple string search was wrong: it actually allows wildcards. This solution replaces this wildcard behaviour. It would be better to retain this as an option similar to how it works in 5.0 for the torrent content filtering.

aaronsql2019 commented 1 month ago

Is there a way to get us to FILTER OUT certain phrases?

Right now, I search for the word 'Playlist'. I'd like to filter OUT the word 'Zoey'. In other words, I want to EXCLUDE any titles that have the word Zoey in the title.
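Where a filter field accepts regular expressions, this kind of exclusion can be written with a negative lookahead. The sketch below assumes that the field in question has a regex mode (not confirmed here for every filter box) and uses Python's `re` as a stand-in for the regex engine; the titles are made up.

```python
import re

# ^(?!.*zoey) rejects any title that contains "zoey" anywhere;
# .*playlist then requires "playlist" to appear somewhere in the title.
pattern = re.compile(r"^(?!.*zoey).*playlist", re.IGNORECASE)

# Hypothetical result titles for illustration.
titles = [
    "Zoeys Extraordinary Playlist S01E01",
    "Summer Playlist 2020",
    "Linux ISO",
]

kept = [title for title in titles if pattern.search(title)]
print(kept)
```

Only titles that contain Playlist and do not contain Zoey remain.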

In Google Searches, I can search for "purple monkeys" -stuffed

This will show me items that have the EXACT PHRASE "purple monkeys" and then Google will filter OUT any results that contain the word 'stuffed'. (For example, I want to find purple monkeys on YouTube, but I don't want to see any stuffed animals.)

I think that there are actually THOUSANDS of these so-called 'Google Dorks'. It's called a 'Google Dork' when you search for certain strings.

For example, if I wanted to search for spreadsheets in Google, I'd filter by FileType and then I'd search for the word SSN to find (hopefully) column headers that pertain to 'Social Security Numbers'.

Please let me know if I can give a better description, or if I can help test this out. I'd be glad to.

I use a fair number of Search Plugins. Right now, when I search on the word 'Playlist' there are 2164 results (across my search plugins). From what I see at the top of page1, probably half of these results pertain to the show 'Zoeys Extraordinary Playlist'. If I could EXCLUDE the word Zoey, then I could filter (OUT) 25% of the results. Just with ONE simple exclude keyword search.

That's the thing about searches. There are INCLUDE searches (only show me items that DO HAVE the word 'Playlist') vs EXCLUDE style searches (HIDE any results if they contain a certain keyword).

I hope I haven't broken any unwritten rules about media file names. I'm OBVIOUSLY not looking for 'Zoeys Extraordinary Playlist' episodes. I actually want to filter those OUT.

All I do is collect Linux ISOs, should I change this dialogue to have those keywords instead??

thalieht commented 1 month ago

Is there a way to get us to FILTER OUT certain phrases?

Not as convenient as prepending a minus but at least it's something: https://github.com/qbittorrent/qBittorrent/issues/10241#issuecomment-459837717

aaronsql2019 commented 1 month ago

Thanks so much. On that #10241, they say 'search should be diacritics and case insensitive'.

I THOUGHT, I mean, I REALLY thought that I found a situation the other day where search was NOT case insensitive. I was searching for something, and it didn't show up.

Then I searched for UPPERCASE first letter, and it showed up. I didn't write down the details, but I wish I had.

Is there a LOG of what you have searched on? (and a way to clean that log). Better yet, if you search on 'purple monkeys' and it just BLATANTLY doesn't return anything, I'd LIKE to see a 'Re-Search' button.

I had a couple of situations the other day where searches weren't working. Then I re-searched on the same phrase again, and it started giving a bunch of results.

I don't mean to be difficult. I'd just love to verify whether this is correct and implemented: search should be diacritics and case insensitive
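To make concrete what "diacritics and case insensitive" would mean for a comparison, here is a minimal sketch in Python. It says nothing about what qBittorrent currently does; `fold` is a hypothetical helper and the titles are made up.

```python
import unicodedata

def fold(text: str) -> str:
    """Strip combining marks and case so that 'É' compares equal to 'e'."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() intended for caseless matching.
    return stripped.casefold()

print(fold("Pokémon") == fold("POKEMON"))
print(fold("playlist") in fold("Summer Playlist 2020"))
```

If a search only succeeds with an uppercase first letter, comparing the folded forms of the query and the title is one way to check whether case handling is the cause.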