srid / emanote

Emanate a structured view of your plain-text notes
https://emanote.srid.ca

Full-text search query #102

Open srid opened 3 years ago

srid commented 3 years ago

Now that we have #324, enable access to it from the query feature. The following,

```query
some text
```

... should list all notes containing 'some text'. Inspired by Obsidian.

See https://github.com/srid/emanote/discussions/48#discussioncomment-955314

We could piggyback on https://github.com/EmaApps/emanote/issues/338 to implement this.

srid commented 2 years ago

Not having full-text search (including client-side search) is a dealbreaker for some projects, e.g. https://github.com/hercules-ci/flake-parts/issues/31#issuecomment-1141259722

applejag commented 2 years ago

Adding stork, as suggested in https://github.com/EmaApps/emanote/pull/242#issuecomment-1100888677, was surprisingly easy. Made a proof-of-concept just to play around, and it works super well :)


The client side of this is easy. The hard part, of course, is to get Haskell to talk the same language and make use of the search results during `emanote gen` and at `emanote run` time.

Porting Stork to Haskell isn't realistic. However their CLI seems capable enough.

Stork always rebuilds the index from scratch, which is bad for bigger sites. However, just to get some perspective, here are some quick benchmarks:

| Site | Indexed files | Search terms | `stork build` time |
| --- | --- | --- | --- |
| https://emanote.srid.ca/ | 30 | 5,597 | ~0.1s |
| https://input-output-hk.github.io/adrestia/ | 94 | 35,721 | ~0.8 to 1s |
| https://chenghaomou.github.io/ | 405 | 56,762 | ~2s |

The index-building times are really good, even for the bigger repos.

And if you lock in to the idea of using Stork, then adding at least statically-built search results is a great start, and would deserve a separate ticket.

My idea of how search support would be added to Emanote:

Sending pages to Stork during `emanote run` could be done by generating one big temporary TOML file with the content embedded in it. That avoids having to sync the notes out to .html files all the time, since `emanote run` keeps everything in memory (if I understand it correctly).
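A minimal sketch of that idea, written in Python only for illustration (Emanote itself is Haskell). It assumes Stork's TOML config accepts file entries with `title`, `url`, and an inline `contents` field under `[input]`; `toml_string` and `stork_config` are hypothetical helper names, not part of Emanote or Stork.

```python
import json


def toml_string(value: str) -> str:
    """Quote a string for TOML.

    Assumption: JSON string escaping is close enough to TOML basic-string
    escaping for a sketch (quotes, backslashes and newlines are escaped).
    """
    return json.dumps(value, ensure_ascii=False)


def stork_config(notes) -> str:
    """Build a Stork config with note contents embedded inline.

    `notes` is an iterable of (title, url, contents) tuples, e.g. taken
    from whatever the live server already holds in memory.
    """
    lines = ["[input]", "files = ["]
    for title, url, contents in notes:
        # One inline table per note; newlines in the contents are escaped,
        # so each entry stays on a single line as TOML requires.
        lines.append(
            "  {"
            f"title = {toml_string(title)}, "
            f"url = {toml_string(url)}, "
            f"contents = {toml_string(contents)}"
            "},"
        )
    lines.append("]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(stork_config([("Home", "/index.html", "Welcome to my notes")]))
```

The resulting text could be written to a temporary file and handed to `stork build`, so nothing but the config itself needs to touch the disk.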

Other ideas:

These are just my two cents. What are your thoughts, @srid ? Maybe this was your plan all along?

srid commented 2 years ago

> And if you lock in to the idea of using Stork, then adding at least statically-built search results is a great start, and would deserve a separate ticket.

I agree, and this is what we should do first (without worrying about the query stuff).

> Adding stork, as suggested in https://github.com/EmaApps/emanote/pull/242#issuecomment-1100888677, was surprisingly easy.

Could you share how you did this? I imagine we can make emanote gen do it automatically.

By the way, the `which` library can be used to include stork as part of the Emanote install.

srid commented 2 years ago

Separate ticket opened: https://github.com/EmaApps/emanote/issues/324

Let's continue the discussion there.

applejag commented 2 years ago

> Could you share how you did this? I imagine we can make emanote gen do it automatically.

Yea sure:

srid commented 2 years ago

We have client-side full text search now, but to integrate it with the query feature we will need #338

applejag commented 2 years ago

> We have client-side full text search now, but to integrate it with the query feature we will need #338

Using the CLI, i.e. `stork search -i stork.st -q "query goes here"`, would suffice, and would probably be much easier to implement.

Using FFI could improve performance, as it would skip translating back and forth between JSON, so I suggest keeping it as a possible future enhancement. But the low-hanging fruit is to just use the CLI, as you already do when building the index.
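For illustration, the CLI route could look roughly like the sketch below (again in Python rather than Haskell, just to show the shape). The flags are the ones quoted above. The JSON layout, a top-level `results` list whose items carry an `entry` with a `url`, is an assumption about Stork's output and should be checked against the installed version. The function names are hypothetical.

```python
import json
import subprocess


def stork_search_argv(index_path: str, query: str) -> list:
    """Argument vector for the `stork search` call quoted in this thread."""
    return ["stork", "search", "-i", index_path, "-q", query]


def matching_urls(search_output: str) -> list:
    """Extract note URLs from the JSON printed by `stork search`.

    Assumption: the output looks like
    {"results": [{"entry": {"url": ...}, ...}, ...], ...}.
    """
    data = json.loads(search_output)
    return [result["entry"]["url"] for result in data.get("results", [])]


def search_notes(index_path: str, query: str) -> list:
    """Run the CLI and return the URLs of matching notes."""
    proc = subprocess.run(
        stork_search_argv(index_path, query),
        capture_output=True,
        text=True,
        check=True,  # raise if stork exits with a non-zero status
    )
    return matching_urls(proc.stdout)
```

A query block would then only need to map the returned URLs back to notes, which is the part that stays the same if the CLI call is later swapped for FFI.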