hoelzro / tw-full-text-search

Full text search plugin for TiddlyWiki powered by lunr.js
https://hoelz.ro/files/fts.html

Full support of lunr's additional features #5

Closed diego898 closed 5 years ago

diego898 commented 6 years ago

Hey @hoelzro,

I love this plugin, and want it to become part of the core of TW eventually! Jeremy indicated he is also looking at lunr-based solutions.

In line with that, it would be awesome if this plugin could fully support the following features lunr also supports:

I recognize that this is a lot! I just figured I would report back the results of my testing in this "wish list"!

Side-note: What do you think of the relationship between this library and filtering?

hoelzro commented 6 years ago

Hi @diego898,

Thanks for your kind words! Here are my thoughts on the features you brought up:

I'll add these to my personal feature tracker for this project (which I should perhaps make public!)

Regarding filtering, did you have anything particular in mind? The plugin provides an undocumented ftsearch filter that you can use, but it's a little stupid at the moment (e.g. it assumes you have a built index). What kinds of things would you like to do with this library and filters?

diego898 commented 6 years ago

Hey @hoelzro, as I understand it from reading:

https://lunrjs.com/guides/searching.html

wildcards and term presence are already implemented in the library. They just need to be exposed in TW.

Term presence is basically lunr's AND: a term MUST be present if it is prefixed with a + and must NOT be present if it is prefixed with a -.
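
To make those semantics concrete, here is a minimal TypeScript sketch of term presence. It is a toy model under stated assumptions, not lunr's implementation: `parseQuery` and `matches` are hypothetical names, tokenization is a naive split on non-alphanumeric characters, and there is no stemming or scoring.

```typescript
// Illustrative only: a toy model of term presence, not lunr's implementation.
type Presence = "required" | "prohibited" | "optional";

interface Clause {
  term: string;
  presence: Presence;
}

// Split a query such as "+customize -plugin theme" into clauses.
function parseQuery(query: string): Clause[] {
  return query
    .trim()
    .split(/\s+/)
    .map((part): Clause => {
      const first = part.charAt(0);
      if (first === "+") return { term: part.slice(1).toLowerCase(), presence: "required" };
      if (first === "-") return { term: part.slice(1).toLowerCase(), presence: "prohibited" };
      return { term: part.toLowerCase(), presence: "optional" };
    })
    .filter((clause) => clause.term.length > 0);
}

// Decide whether a piece of text satisfies the clauses.
function matches(text: string, clauses: Clause[]): boolean {
  // Naive tokenization (assumption): lowercase, split on non-alphanumerics.
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  const has = (term: string): boolean => tokens.indexOf(term) !== -1;

  const required = clauses.filter((c) => c.presence === "required");
  const prohibited = clauses.filter((c) => c.presence === "prohibited");
  const optional = clauses.filter((c) => c.presence === "optional");

  // Every "+" term must be present.
  if (!required.every((c) => has(c.term))) return false;
  // No "-" term may be present.
  if (prohibited.some((c) => has(c.term))) return false;
  // With no required terms, at least one optional term must match (if any were given).
  if (required.length === 0 && optional.length > 0) {
    return optional.some((c) => has(c.term));
  }
  return true;
}
```

Under this model, `matches("How to customize the theme", parseQuery("+customize"))` is true, while adding `-theme` to the same query rejects that text.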

hoelzro commented 6 years ago

Ah, ok - is term presence a new feature? Since I'm just calling lunr's query function, you should be able to just type in the + character for it to work - I probably need to update lunr.js, though.

diego898 commented 6 years ago

I'm not sure if it's new, but when I tried it with +customize, nothing came up.

NOTE: I forgot to mention above that all of the tests were done on a local copy of TW with this plugin installed.

diego898 commented 6 years ago

My comment about filtering was more an observation, that this search plugin is very powerful and could replace/augment many things currently done with filtering in TW

hoelzro commented 6 years ago

I just checked - I've been using lunr.js 2.1.4, whereas term presence was added to 2.2.0. I'll make an issue to add this!

hoelzro commented 6 years ago

Regarding filtering, do you think the existing ftsearch filter would be good enough, or should I make additional filters?

diego898 commented 6 years ago

Sorry Rob, what is ftsearch? I was just observing that right now in TW (without this plugin) I use a lot of filtering, things like: [tag[testing]regexp[mystring]], etc. This search plugin lets you also achieve many of the same goals that someone would use filtering for.

hoelzro commented 6 years ago

@diego898 Oh, sorry! ftsearch is the filter that this plugin provides that basically powers the whole thing.

BTW, do you have any interest/time in contributing code to this project? If so, I can always add you as a contributor!

diego898 commented 6 years ago

I do indeed, but I don't know (1) much JavaScript or (2) enough about TW to figure out how they cleanly interact. I've been studying your plugin to see if I can indeed make changes!

hoelzro commented 6 years ago

@diego898 Ok, that's understandable. Any code contribution is most welcome, but no pressure - just the feedback you've been giving me so far has been great!

diego898 commented 6 years ago

Hey @hoelzro, I'm seeing some bugs in the latest version.

[Four screenshots attached]

As you can see, the formatting of results is strange, and I'm not sure fuzzy matching is working correctly.

hoelzro commented 6 years ago

Thanks for the report @diego898 - I'll have a look!

hoelzro commented 6 years ago

@diego898 Is this just the wiki from tiddlywiki.com with the FTS plugin added? I just tried that and it looks ok; would it be possible to share the wiki you're seeing this on?

diego898 commented 6 years ago

Hey @hoelzro, yeah, I just installed the latest version into a copy of TW.com. Are you not seeing what I see in the screenshots?

hoelzro commented 6 years ago

No, I'm not =( What browser are you using? Could you send me the HTML file you're working with?

Also, regarding the fuzzy matching, what results are you seeing that you didn't expect to, or what's missing from the results that you expected to see?

diego898 commented 6 years ago

Hey @hoelzro how do I send it to you?

For the fuzzy matching, I was searching for "learning", but some of the first results that came up didn't have anything close to that word, as far as I could tell (maybe I'm wrong?).

Also, the results are all displayed on a single line.

Also when searching for "tiddly" why did it only bring up one result?

Also for title:tid*l why did it not bring up everything with tiddly in the title?

hoelzro commented 6 years ago

@diego898 You should be able to attach a file to this issue (there's a little blurb describing how at the bottom of the reply box) - if that doesn't work, you can send it to my e-mail address, which you can find on my site, https://hoelz.ro.

By the way, which browser are you using? Do you have any extensions enabled in that browser?

I don't see that single line behavior when I import the plugin into https://tiddlywiki.com; if you can get me that HTML file, I can dig in and try to figure out what's going on.

As far as the search results go, let's handle the display issues first and then we can move on to those; I have a feeling it has something to do with how the index is built.

hoelzro commented 6 years ago

Thanks for sending me the file @diego898 - I realized right away what the problem was! I didn't notice that you were using the advanced search; that's a bug I've had for a while but completely forgot about!

hoelzro commented 6 years ago

I've filed that away as #8.

diego898 commented 6 years ago

@hoelzro ah I see. Sorry, should have made that clearer! Should I repost my results under regular search?

hoelzro commented 6 years ago

@diego898 Do regular search results render the same way for you, or do those look ok?

diego898 commented 6 years ago

They look fine on regular search

hoelzro commented 6 years ago

Ok, good - I'll whip up a patch within the next few days!

diego898 commented 6 years ago

Hey @hoelzro, have you noticed the new lunr/fuzzy search plugin on Google Groups? What do you think? I'm not sure why your plugin hasn't caused the fanfare on GG that that plugin did.

hoelzro commented 6 years ago

@diego898 I have - I think it's nice to have alternatives, and IMO fuzzy search and FTS occupy different parts of a similar niche. I didn't have an entirely cold reception when I announced FTS on the group, but I would guess that part of it comes down to @TheDiveO's attention to detail and design. The TwFuseJs demo wiki is much nicer looking than mine! It would also help if I made a release in the near future 😅

Feature-wise, it would be nice if FTS maintained the index entirely without user intervention - maybe that would be good for a new version!

(@TheDiveO sorry for mentioning you out of nowhere, but if you have any ideas on how to improve TW-FTS, I would love to hear them!)

hoelzro commented 6 years ago

@diego898 Regarding those search results for title:tid*l and lerning~2 above - what are you trying to use the plugin to do? I just want to make sure I fully understand. I'm assuming for the former you want to match any tiddler whose title has a word that has "tid", followed by zero or more characters, then "l" in it, and for the latter you only want to match tag:learning even with the spelling mistake?

hoelzro commented 6 years ago

@diego898 ping

diego898 commented 6 years ago

Hey @hoelzro, sorry for the delay - I'm trying to use title:tid*l to find all tiddlers whose title contains the string tid---ANY LENGTH INCLUDING ZERO CHARACTERS---l, as you said. lerning~2 should include all tiddlers with the text "learning" inside them (not just the tag).

hoelzro commented 6 years ago

@diego898 No worries - I just want to make sure I fully understand what you want so I can deliver! Thanks for clarifying - by that logic, title:sett*le should match tiddlers with settle in the title, right?

diego898 commented 6 years ago

yup!

hoelzro commented 6 years ago

Ok - so I think that's going to be a problem. =( Since lunr.js stems everything, we don't keep settle in the database anywhere - it gets reduced to settl. I'm not sure how lunr's author intended wildcards to handle this; I'm going to do a little research and see if there's a recommended solution!
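
A minimal TypeScript sketch of the mismatch described above, under stated assumptions: `toyStem` is a hypothetical stand-in that only strips a trailing "e" (lunr's real stemmer is far more involved), and `wildcardToRegExp` is a hypothetical helper that treats `*` as "zero or more characters". Neither is part of lunr or of this plugin.

```typescript
// Illustrative only: why a wildcard written against the original word can miss its stem.

// Hypothetical stand-in for a stemmer: strips a single trailing "e".
// This is NOT the stemmer lunr uses; it merely reproduces settle -> settl.
function toyStem(word: string): string {
  return word.toLowerCase().replace(/e$/, "");
}

// Convert a wildcard pattern such as "sett*le" into an anchored RegExp,
// where "*" stands for zero or more characters.
function wildcardToRegExp(pattern: string): RegExp {
  const body = pattern
    .split("*")
    .map((piece) => piece.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*");
  return new RegExp("^" + body + "$");
}

const pattern = wildcardToRegExp("sett*le");
const original = "settle";
const indexed = toyStem(original); // "settl" is what ends up in the index

const matchesOriginal = pattern.test(original); // true: the pattern fits the unstemmed word
const matchesIndexed = pattern.test(indexed);   // false: the stem no longer ends in "le"
```

The pattern fits the word as typed but not the stored stem, which is the situation the wildcard search runs into when only stems are indexed.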

hoelzro commented 5 years ago

@diego898 I'm sorry it's taken so long, but I finally released version 1.1.0 of the plugin! It contains a lot of improvements around wildcard and fuzzy searching, so please give it a shot if you have time and let me know what you think!

diego898 commented 5 years ago

Awesome, Rob! I'll give it a try! Thanks for your continued work on this!

diego898 commented 5 years ago

Hey @hoelzro, in testing I enabled wildcard and fuzzy searching, enabled auto-generation, and regenerated my index. How come if I have a tiddler with title "... re-order ..." and I search for reorder~1, nothing comes up?

hoelzro commented 5 years ago

@diego898 Thanks for trying it out - I'll look into this!

hoelzro commented 5 years ago

I see what's happening - re-order is being tokenized by lunr into re and order. reorder~1 would match re-order, but since lunr operates entirely on a single token level, re-order isn't in the index at all. I'm wondering if the tokenizer should be tweaked to allow for hyphenated words 🤔
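
A minimal TypeScript sketch of that effect, under stated assumptions: `tokenize`, `editDistance`, and `fuzzyHits` are hypothetical helpers, the separator is passed in explicitly, and plain Levenshtein distance stands in for whatever lunr does internally for `~1`.

```typescript
// Illustrative only: fuzzy matching against individual tokens, not lunr's implementation.

// Split text into lowercase tokens using the given separator.
function tokenize(text: string, separator: RegExp): string[] {
  return text.toLowerCase().split(separator).filter((t) => t.length > 0);
}

// Plain Levenshtein distance (insertions, deletions, substitutions).
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the tokens within maxEdits of the query term.
function fuzzyHits(term: string, maxEdits: number, tokens: string[]): string[] {
  return tokens.filter((t) => editDistance(term, t) <= maxEdits);
}

const title = "How to re-order items";

// Splitting on whitespace and hyphens yields "re" and "order" as separate tokens.
const splitOnHyphen = tokenize(title, /[\s\-]+/);
// Splitting on whitespace only keeps "re-order" intact.
const keepHyphen = tokenize(title, /\s+/);

const hitsWhenSplit = fuzzyHits("reorder", 1, splitOnHyphen); // [] (no token is within 1 edit)
const hitsWhenKept = fuzzyHits("reorder", 1, keepHyphen);     // ["re-order"]
```

Keeping hyphenated words whole makes this particular query work, but it would also stop a search for order from finding re-order, so it is a trade-off rather than a clear fix.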

diego898 commented 5 years ago

Thanks for figuring this out! By the way, do you think this will make it into the TW core? Also, should we close this issue now?

hoelzro commented 5 years ago

Regarding the TW core - I have a feeling it's a little too "heavy" for core, but that's ultimately up to Jeremy.

Regarding the issue, if you feel like it can be closed, we can close it! I think I'm tracking any issues reported here elsewhere, so I have no objections!

diego898 commented 5 years ago

Ok thanks Rob!