benwbrum / fromthepage

FromThePage is a wiki-like application for crowdsourcing transcription of handwritten documents.
http://fromthepage.com
GNU Affero General Public License v3.0

Boolean search/quotations in search strings #2360

Open saracarl opened 3 years ago

saracarl commented 3 years ago

If you type "Mrs. Alice Mann" into a search box, you get results for both Mrs. Alice Mann and Miss Alice Mann, because we seem to convert the quoted string to +"Mrs.* +Alice* +Mann"'* instead of treating it as an exact phrase.

https://www.fromthepage.com/paged_search?action=search&authenticity_token=%2FYx0HfuqMTArO2u6Xiab8XAmYsXNidd%2BySiu%2Btl8HF6ho7UMqHpcIV1jWPpXUxy4YPzBftoDRhvTGTVJBSom7g%3D%3D&button=&collection_id=mount-auburn-cemetery&controller=display&search_string=%22Mrs.+Alice+Mann%22%27

We should also respect any boolean search operators a user types in, and document which ones we'll support: https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html

'apple banana'
Returns pages with either apple or banana in them.

'+apple +juice'
Finds pages that contain both words.

'"apple juice"'
Returns just pages with the phrase "apple juice".

benwbrum commented 3 years ago

We need to provide some intermediary processing in the search logic to allow advanced users to do boolean or quoted searches, while still supporting the (simpler) logic to add wildcards around terms.
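As a rough illustration of what that intermediary step could look like, here is a minimal Python sketch. It is not FromThePage's actual code. The function names `looks_advanced` and `build_boolean_query` are hypothetical, and the "simple" branch (prefix each term with `+`, suffix it with `*`) is only inferred from the converted string quoted in the report above. The idea is to pass a query through untouched when it already looks like boolean or quoted syntax, and otherwise fall back to the wildcard behavior.

```python
def looks_advanced(search_string: str) -> bool:
    """Heuristic (an assumption, not project code): treat the query as
    'advanced' if it contains a double quote, or if any term starts with
    + or - or ends with *."""
    if '"' in search_string:
        return True
    return any(
        term[0] in "+-" or term.endswith("*")
        for term in search_string.split()
    )


def build_boolean_query(search_string: str) -> str:
    """Return a string intended for a MySQL boolean-mode full-text search."""
    cleaned = search_string.strip()

    # An unbalanced double quote cannot form a phrase, so drop the quotes
    # and let the remaining terms go through the simple path.
    if cleaned.count('"') % 2 == 1:
        cleaned = cleaned.replace('"', "")

    if looks_advanced(cleaned):
        # Advanced users: respect the operators and phrases as typed.
        return cleaned

    # Simple path: require every term and allow prefix matches.
    return " ".join(f"+{term}*" for term in cleaned.split())


if __name__ == "__main__":
    for query in ['Alice Mann', '"Mrs. Alice Mann"', '+apple +juice']:
        print(repr(query), "->", repr(build_boolean_query(query)))
```

With this heuristic a hyphenated word such as well-known stays on the simple path, because only a leading `-` counts as an operator. Whether that is the right trade-off would need checking against real searches.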

Mount Auburn has requested this, but I think that #1465 and #2533 are higher priorities.