TagStudioDev / TagStudio

A User-Focused Photo & File Management System
https://docs.tagstud.io/
GNU General Public License v3.0

[Feature Request]: FINAL search engine features for 9.5 #600

Open python357-1 opened 1 week ago

python357-1 commented 1 week ago

Description

This issue describes the current consensus on which features the search engine will have for the 9.5 release.

REQUIRED for 9.5:

New queries: a definition and implementation of a "query language" that can be converted to SQL queries made against the database.

| query | description | allowed values | example |
| --- | --- | --- | --- |
| tag | searches by tag name, with an optional "disambiguation" syntax that specifies which tag you are looking for, if there are multiple matches | string | tag: Mario, tag: Mario[parent=nintendo] |
| tag_id | searches by the internal ID of a tag | int | tag_id: 1001 |
| mediatype | searches by name property of MediaCategories (will eventually use the translated name) | string | mediatype: video |
| filetype | searches by the file extension | string | filetype: jpg |
| path | searches by complete path of files; allows globs | string | path: folder/* |
| special | searches by special metadata of entries | "untagged", "unlinked" | special: untagged |
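
As a rough illustration of how constraints like the ones listed above could be turned into SQL, here is a minimal sketch using Python's built-in `sqlite3`. The `entries` table, its `path` and `suffix` columns, and the `constraint_to_sql` and `search` helpers are all hypothetical; this is not TagStudio's actual schema or API, and only `filetype` and `path` are handled.

```python
import sqlite3

# Hypothetical schema for illustration only; not TagStudio's real tables.
SCHEMA = "CREATE TABLE entries (id INTEGER PRIMARY KEY, path TEXT, suffix TEXT)"


def constraint_to_sql(kind: str, value: str) -> tuple[str, list[str]]:
    """Translate one `kind: value` constraint into a WHERE fragment and parameters."""
    if kind == "filetype":
        # Compare extensions case-insensitively, ignoring a leading dot.
        return "LOWER(suffix) = ?", [value.lower().lstrip(".")]
    if kind == "path":
        # SQLite's GLOB operator understands * and ? wildcards.
        return "path GLOB ?", [value]
    raise ValueError(f"unsupported constraint type: {kind}")


def search(conn: sqlite3.Connection, kind: str, value: str) -> list[str]:
    """Run a single-constraint search and return the matching paths."""
    where, params = constraint_to_sql(kind, value)
    rows = conn.execute(f"SELECT path FROM entries WHERE {where} ORDER BY path", params)
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO entries (path, suffix) VALUES (?, ?)",
    [("folder/a.jpg", "jpg"), ("folder/b.png", "png"), ("other/c.JPG", "JPG")],
)
```

User-supplied values are passed only as bound parameters; the WHERE fragments themselves come from fixed strings, which keeps the query text free of user input.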

STRETCH GOALS for 9.5

Computerdores commented 5 days ago

I am interested in working on a parser for this. Feedback on my current WIP grammar would be nice:

ANDList        = ORList ( ["AND"] ORList )* ;
ORList         = Term ( "OR", Term )* ;
Term           = Constraint | "(", ANDList, ")" ;

Constraint     = [ConstraintType, ":"], Literal, "[", PropertyList, "]" ;

ConstraintType = "tag" | "mediaType" ; (* not a complete list *)
PropertyList   = Property, (",", Property)* ;
Property       = ULITERAL, "=", Literal ;
Literal        = ULITERAL | QLITERAL ;

Notes:

python357-1 commented 5 days ago

Looks pretty good to me! I'm just curious: are ANDLists and ORLists separate things so that one takes precedence over the other? If so, does the current grammar make ANDs take precedence over ORs? That would be ideal.

Computerdores commented 5 days ago

> Looks pretty good to me!

Thanks! I am also much happier with this than with my previous version. This is much simpler and should be easier to parse.

> are ANDLists and ORLists separate things so that one takes precedence over the other?

Yes!

> does the current grammar make ANDs take precedence over ORs? That would be ideal

No, it does not; OR would be evaluated first here. AND taking precedence over OR would be more intuitive to me too. However, IIRC @CyanVoxel suggested this somewhere in the discussion on the Discord, and I generally don't care too much about it.

(Cyan, if I am not supposed to ping you, let me know. I asked on the Discord before, but a big discussion broke out right after, so I believe you missed it.)
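
To make the precedence difference concrete, the sketch below evaluates the same flat query, `a OR b AND c`, under both groupings. The helper names and the token-list representation are assumptions for illustration and are not part of the proposed parser; only explicit `AND`/`OR` operators are handled, with no parentheses or implicit AND.

```python
def split_on(tokens, op):
    """Split a flat token list into sub-lists at each occurrence of `op`."""
    groups, current = [], []
    for tok in tokens:
        if tok == op:
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    return groups


def eval_and_binds_tighter(tokens, values):
    # OR of ANDs: split on OR first, so each AND group is evaluated as a unit.
    return any(
        all(values[t] for t in group if t != "AND")
        for group in split_on(tokens, "OR")
    )


def eval_or_binds_tighter(tokens, values):
    # AND of ORs: split on AND first, so each OR group is evaluated as a unit.
    return all(
        any(values[t] for t in group if t != "OR")
        for group in split_on(tokens, "AND")
    )


query = ["a", "OR", "b", "AND", "c"]
values = {"a": True, "b": False, "c": False}
```

With these values, `eval_and_binds_tighter(query, values)` reads the query as `a OR (b AND c)` and matches, while `eval_or_binds_tighter(query, values)` reads it as `(a OR b) AND c` and does not.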

CyanVoxel commented 5 days ago

> No, it does not; OR would be evaluated first here. AND taking precedence over OR would be more intuitive to me too. However, IIRC @CyanVoxel suggested this somewhere in the discussion on the Discord, and I generally don't care too much about it.

Unless it was a looooong time ago, I don't remember preferring OR over AND; the opposite seems more intuitive to me as well. I know I brought up a preference for implicit AND when no operator is given, but I'm not sure if that's related here.

> (Cyan, if I am not supposed to ping you, let me know. I asked on the Discord before, but a big discussion broke out right after, so I believe you missed it.)

It's alright to ping me, I miss too much stuff to warrant not being pinged 🙃

Computerdores commented 5 days ago

> > No, it does not; OR would be evaluated first here. AND taking precedence over OR would be more intuitive to me too. However, IIRC @CyanVoxel suggested this somewhere in the discussion on the Discord, and I generally don't care too much about it.
>
> Unless it was a looooong time ago, I don't remember preferring OR over AND; the opposite seems more intuitive to me as well. I know I brought up a preference for implicit AND when no operator is given, but I'm not sure if that's related here.

Alright, nvm then. I have a first go at the parser almost done (still with OR preference, though), so I might open a Draft PR in the next few days.

Computerdores commented 5 days ago

Updated Grammar (AND now binds more tightly than OR, as is conventional):

ORList         = ANDList ( "OR", ANDList)* ;
ANDList        = Term ( ["AND"] Term )* ;
Term           = Constraint | "(", ORList, ")" ;

Constraint     = [ConstraintType, ":"], Literal, "[", PropertyList, "]" ;

ConstraintType = "tag" | "mediaType" ; (* not a complete list *)
PropertyList   = Property, (",", Property)* ;
Property       = ULITERAL, "=", Literal ;
Literal        = ULITERAL | QLITERAL ;
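
For anyone following along, here is a minimal recursive-descent sketch of the updated grammar. It is not the code from the draft PR. The tokenizer, the tuple-based AST, treating the bracketed property list as optional, and defaulting a bare literal to a `tag` constraint are all assumptions made for illustration. ULITERAL is assumed to be an unquoted word and QLITERAL a double-quoted string.

```python
import re

# Assumed token shapes: quoted strings, single punctuation marks, or bare words.
TOKEN_RE = re.compile(r'\s*("[^"]*"|[()\[\]:=,]|[^\s()\[\]:=,"]+)')
RESERVED = ("(", ")", "[", "]", ":", "=", ",", "AND", "OR")


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def or_list(self):
        # ORList = ANDList ( "OR" ANDList )*
        items = [self.and_list()]
        while self.peek() == "OR":
            self.take("OR")
            items.append(self.and_list())
        return items[0] if len(items) == 1 else ("OR", items)

    def and_list(self):
        # ANDList = Term ( ["AND"] Term )* ; a missing operator means AND
        items = [self.term()]
        while self.peek() not in (None, "OR", ")"):
            if self.peek() == "AND":
                self.take("AND")
            items.append(self.term())
        return items[0] if len(items) == 1 else ("AND", items)

    def term(self):
        # Term = Constraint | "(" ORList ")"
        if self.peek() == "(":
            self.take("(")
            node = self.or_list()
            self.take(")")
            return node
        return self.constraint()

    def constraint(self):
        # Assumption: a literal without a type prefix is a tag search.
        kind, value = "tag", self.literal()
        if self.peek() == ":":
            self.take(":")
            kind, value = value, self.literal()
        props = {}
        if self.peek() == "[":  # assumption: the property list is optional
            self.take("[")
            while True:
                key = self.literal()
                self.take("=")
                props[key] = self.literal()
                if self.peek() != ",":
                    break
                self.take(",")
            self.take("]")
        return ("constraint", kind, value, props)

    def literal(self):
        tok = self.take()
        if tok in RESERVED:
            raise ValueError(f"expected a literal, got {tok!r}")
        return tok.strip('"')


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.or_list()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return tree
```

Because `or_list` calls `and_list` for each operand, AND groups are built first and OR combines them, which is what gives AND the tighter binding.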

Notes:

Computerdores commented 4 days ago

With this proposal we are currently missing a way to search for untagged files.

My idea on how to fix this would be the following syntax: special:untagged.

The advantage is that this could also be used to add other special criteria, e.g. special:unlinked to search for unlinked entries. It would also integrate nicely into the grammar and the existing code base in #606.

CyanVoxel commented 4 days ago

> With this proposal we are currently missing a way to search for untagged files.
>
> My idea on how to fix this would be the following syntax: special:untagged.
>
> The advantage is that this could also be used to add other special criteria, e.g. special:unlinked to search for unlinked entries. It would also integrate nicely into the grammar and the existing code base in #606.

I like this approach. Simply typing "empty" or "untagged" was nice in 9.4, but realistically it shadowed any potential tags with those names. This new approach avoids that issue while also making sure it plays nicely with the grammar 👍
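
The shadowing point can be sketched with a toy matcher. Everything here is hypothetical (the `Entry` class, the `matches` function, and the reading of "unlinked" as "file missing on disk" are assumptions, not TagStudio code). Because `special` is its own constraint type, a tag that is literally named "Untagged" stays searchable.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    # Hypothetical stand-in for a library entry.
    path: str
    tags: set[str] = field(default_factory=set)
    file_exists: bool = True


def matches(entry: Entry, kind: str, value: str) -> bool:
    """Evaluate a single `kind:value` constraint against one entry."""
    if kind == "special":
        if value == "untagged":
            return not entry.tags
        if value == "unlinked":
            # Assumption: "unlinked" means the file on disk is missing.
            return not entry.file_exists
        raise ValueError(f"unknown special value: {value}")
    if kind == "tag":
        # Case-insensitive tag name comparison.
        return value.lower() in {t.lower() for t in entry.tags}
    raise ValueError(f"unsupported constraint type: {kind}")


entries = [
    Entry("a.jpg"),
    Entry("b.jpg", {"Untagged"}),  # carries a tag literally named "Untagged"
    Entry("c.jpg", {"Mario"}, file_exists=False),
]
```

Here `special:untagged` selects only `a.jpg`, while `tag:untagged` selects only `b.jpg`, so the two meanings no longer collide.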

python357-1 commented 4 days ago

Added a table for all current search queries. Let me know if anything needs to be changed.