MarkBind / markbind

MarkBind is a tool for generating content-heavy websites from source files in Markdown format
https://markbind.org/
MIT License

Auto-search content of pages #205

Open damithc opened 6 years ago

damithc commented 6 years ago

Current: only page titles and specified keywords in the frontmatter appear in search results.

Suggested: also include other content in pages for search results

damithc commented 5 years ago

Raising priority as full-text search can greatly enhance the usefulness of a content-heavy website.

I don't mind if full-text search is on a separate page altogether and takes some time to load (e.g., if the full search index needs to be downloaded to the browser first).

amad-person commented 5 years ago

Are we open to integrating existing solutions for full-text search?

DocSearch (free, open-source):

DocSearch will crawl your documentation website, push its content to an Algolia index, and allow you to add a dropdown search menu for your users to find relevant content in no time.

damithc commented 5 years ago

> Are we open to integrating existing solutions for full-text search?

Ideally, we should have a decent built-in solution and the ability to integrate other third-party solutions.

yamgent commented 5 years ago

As discussed with @marvinchin today, Marvin is planning to explore using the Lunr.js library to implement a built-in full-text search. This library is also used by MkDocs.
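
For readers unfamiliar with how a built-in full-text search of this kind typically works, here is a minimal sketch of the underlying idea: an inverted index built from page content at build time, serialized so a browser could download it, then queried. It is written in Python for brevity and is not Lunr.js's API or MarkBind's implementation. The page paths, page text, and the function names `tokenize`, `build_index`, and `search` are all hypothetical.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the sorted list of page paths that contain it."""
    index = defaultdict(set)
    for path, body in pages.items():
        for token in tokenize(body):
            index[token].add(path)
    return {token: sorted(paths) for token, paths in index.items()}


def search(index, query):
    """Return the pages that contain every token of the query (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return []
    results = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        results &= set(index.get(token, []))
    return sorted(results)


# Hypothetical pages: path -> body text (not only titles or keywords).
pages = {
    "index.html": "MarkBind generates content-heavy websites from Markdown.",
    "search.html": "Full-text search covers page content, not only titles.",
}

index = build_index(pages)
# A build step could write this JSON out for the browser to download.
serialized = json.dumps(index)

print(search(json.loads(serialized), "page content"))
```

A real library adds stemming, ranking, and a more compact index format. The sketch only shows why the index grows with page content and therefore may take some time to download.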

marvinchin commented 5 years ago

Are we still looking to have built-in full text search for V2? 😅 I'm not sure that I can finish it by the end of the semester.

damithc commented 5 years ago

> Are we still looking to have built-in full text search for V2? 😅 I'm not sure that I can finish it by the end of the semester.

Good to have, but not necessary. Same for the FOUC problem. Both have a good-enough workaround but not a full-fledged solution.

ang-zeyu commented 2 years ago

I've just published an almost year-long project originally motivated by this issue:

It consists of a CLI file indexer (which can be integrated by copying the binary, similar to what we do for plantuml.jar), a search library powered by WebAssembly (Rust), and a search UI (TypeScript).

It deals with the issue in 2 aspects:

I haven't really marketed it as I'm still tying up some things (e2e tests, getting Windows Defender to stop flagging the executables as viruses, some more bugs), but I could look into integrating it here sometime 😃.

damithc commented 2 years ago

> I've just published an almost year-long project originally motivated by this issue:

Nice work, @ang-zeyu. Let's aim to integrate it into MarkBind in due course.

damithc commented 2 years ago

I'm increasing the priority because Algolia DocSearch is undergoing a major revamp and they haven't been able to provide the search support for our module websites this semester so far. The sooner we reduce reliance on third-party search the better.

ang-zeyu commented 1 year ago

If anyone would like to take up this issue, please feel free; I think this would be a rather fun thing to do. The library I mentioned above is more or less ready for use. I am currently just in a fun infinite loop of "making it better and more marketable" without actually doing any marketing 🤔😅

In the course of doing this, I also came across several related alternatives that you can consider. All of them follow a CLI + wasm frontend architecture:

Please don't let my selling here stop you from exercising your own judgement. Feel free to come to your own reasoning and choice, and post back here. I would love to hear your thoughts.

ang-zeyu commented 1 year ago

Some non-exhaustive guidelines for implementation:

jingting1412 commented 7 months ago

Hello, I've been looking at this issue. One problem I've encountered is that content in components hidden from the user during the initial render (e.g., Panels) is not included in the search results. This is because libraries like Pagefind index the content only after the HTML files have been built. The same rendering problem is faced by other plugins such as dataTable (@Tim-Siu) and Mermaid (@yiwen101 @LamJiuFong).

This behaviour is similar to the Algolia DocSearch setup we use now, which automatically adds algolia-no-index to content hidden by MarkBind's Vue components, so content hidden in panels likewise does not show up in search results.

With this in mind, I'd like to confirm: should the full-text search we want to implement include content inside panels, or is it OK for that content not to show up in the search results?

damithc commented 7 months ago

> This behaviour is similar to the Algolia DocSearch setup we use now, which automatically adds algolia-no-index to content hidden by MarkBind's Vue components, so content hidden in panels likewise does not show up in search results.

> With this in mind, I'd like to confirm: should the full-text search we want to implement include content inside panels, or is it OK for that content not to show up in the search results?

@jingting1412 I think it is fine (even necessary) to omit content from collapsed panels. But we can index content from expanded-by-default panels, right?
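
A minimal sketch of the rule discussed above (omit hidden content, index what is visible by default), assuming the indexer runs over built HTML and that hidden subtrees carry the algolia-no-index class mentioned earlier. It is written in Python for brevity and is not how Pagefind, DocSearch, or MarkBind implement this. The sample markup, the `panel expanded` class names, and the helper `indexable_text` are hypothetical, and the parser assumes well-formed HTML with balanced tags.

```python
from html.parser import HTMLParser

# Class named in this thread; reused here as the "do not index" marker.
SKIP_CLASS = "algolia-no-index"
# Void elements never get an end tag, so they must not change the depth count.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class IndexableText(HTMLParser):
    """Collects text, ignoring any subtree whose root carries SKIP_CLASS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or SKIP_CLASS in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a search indexer would see for this HTML fragment."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical built output: one expanded panel, one collapsed (marked) panel.
html = (
    '<div class="panel expanded"><p>Shown by default</p></div>'
    '<div class="panel algolia-no-index"><p>Hidden until <b>opened</b></p></div>'
    '<p>Regular paragraph</p>'
)

print(indexable_text(html))
```

Under these assumptions, an expanded-by-default panel is indexed simply because the build does not mark it, so the decision reduces to which components receive the marker class.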