comcode-org / hackmud_wiki

https://wiki.hackmud.com/

There is no way to search the wiki #74

Closed. danswann closed this issue 1 month ago

danswann commented 8 months ago

Problem

Users can't search the wiki. The only way to get to different pages is through site navigation.

Context

Docusaurus supports Algolia DocSearch out of the box. There are other search options available with Docusaurus support, both hosted and client-side, but Algolia is by far the most widely used.

Owners of technical sites can apply to Algolia DocSearch for free. Algolia then indexes the site and provides an API key to plug into the Docusaurus config.
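For illustration, here is a minimal sketch of where those credentials would go, assuming the standard `themeConfig.algolia` block that Docusaurus reads. The `AlgoliaSettings` interface is a local stand-in so the snippet is self-contained, and every value is a placeholder, not the wiki's real configuration.

```typescript
// Local stand-in type; a real docusaurus.config.ts would instead import
// `Config` from '@docusaurus/types'.
interface AlgoliaSettings {
  appId: string;
  apiKey: string; // the public, search-only key, not an admin key
  indexName: string;
  contextualSearch?: boolean;
}

// Placeholder values: Algolia supplies the real ones once an application
// is approved.
const algolia: AlgoliaSettings = {
  appId: "YOUR_APP_ID",
  apiKey: "YOUR_SEARCH_API_KEY",
  indexName: "YOUR_INDEX_NAME",
  contextualSearch: true,
};

// In the Docusaurus config, the settings live under themeConfig.algolia.
const themeConfig = { algolia };
```

The search-only key is meant to be public, so committing it to the repo is expected.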

There are a few requirements before applying. Most relevant to us is this one:

> Your website must be production ready. We won't index empty websites nor those filled with placeholder content. Please, wait until you have written some documentation before applying. We would be happy to help you as soon as you have a steady design.

I'd argue we don't meet that requirement yet, but @seanmakesgames says he has already been in contact with someone at Algolia and might know more.

In any case, I think once we finish P009 we will have sufficient content to qualify.

hecksadecimal commented 4 months ago

An alternative approach would be text vectorization and lookup, which would keep the infrastructure in ComCode's control. There are plenty of nice, free, and self-hosted vector databases these days. As pages are added, deleted, and edited here a github action could create the vectors and push them to that vector database.
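As a rough sketch of what "vectorization and lookup" means, the snippet below turns each page into a term-count vector and ranks pages by cosine similarity to the query. This is a toy stand-in: a real setup would presumably use a learned embedding model and a vector database, neither of which is specified in this thread. The `Page` shape and function names are hypothetical.

```typescript
// Toy term-frequency vectors; a stand-in for real embeddings.
type Vector = Record<string, number>;

interface Page {
  path: string; // hypothetical: URL path of a wiki page
  text: string; // hypothetical: plain-text content of that page
}

// Count lower-cased alphanumeric tokens in the text.
function vectorize(text: string): Vector {
  const counts: Vector = Object.create(null);
  const tokens = text.toLowerCase().match(/[a-z0-9_]+/g) || [];
  for (const token of tokens) {
    counts[token] = (counts[token] || 0) + 1;
  }
  return counts;
}

// Euclidean length of a vector.
function norm(v: Vector): number {
  let sum = 0;
  for (const term of Object.keys(v)) sum += v[term] * v[term];
  return Math.sqrt(sum);
}

// Cosine similarity; 0 when either vector is empty.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  for (const term of Object.keys(a)) dot += a[term] * (b[term] || 0);
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Return the paths of the best-matching pages, best first.
function search(pages: Page[], query: string, limit = 3): string[] {
  const q = vectorize(query);
  return pages
    .map((p) => ({ path: p.path, score: cosine(q, vectorize(p.text)) }))
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((r) => r.path);
}
```

In the workflow described above, the `vectorize` step would run in CI whenever pages change, and only the lookup would run at query time.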

This issue is particularly important to me: I'm building a chatbot for hackmud that will answer questions by scouring the wiki. As it currently stands, I would have to do some particularly annoying scraping to get that done, or automate searches against this repo instead.

danswann commented 4 months ago

This is hosted on GH Pages, so unless the vector database is running in every client's browser, that would require additional infrastructure.

We could add a sitemap at least, if that would support your scraping efforts. EDIT: sike, I just realised you're looking for text search more than a fix for discoverability.

hecksadecimal commented 4 months ago

A sitemap would still help immensely. I could scrape the pages far more easily and build my own embeddings database for my bot.
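To show why a sitemap simplifies scraping, here is a minimal sketch that pulls page URLs out of a standard `sitemap.xml` document. It assumes the usual `<urlset>`/`<url>`/`<loc>` layout and uses a regular expression in place of a full XML parser. The function name is hypothetical.

```typescript
// Extract every <loc> URL from a sitemap XML string.
// Regex-based for brevity; a real scraper may prefer a proper XML parser.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemaps escape "&" as "&amp;", so undo that for usable URLs.
    urls.push(match[1].replace(/&amp;/g, "&"));
  }
  return urls;
}
```

With the URL list in hand, a scraper can fetch each page directly and skip crawling the site navigation.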

danswann commented 4 months ago

> A sitemap would still help immensely.

Cool, tracking this in #436. Should be able to bang it out quick if we get the go-ahead.

danswann commented 4 months ago

Re: search,

@seanmakesgames I think we have enough content now to apply for Algolia DocSearch, with or without your contact. Do you want to handle that personally on behalf of ComCODE, or is anyone free to apply for us?

seanmakesgames commented 4 months ago

> Re: search,
>
> @seanmakesgames I think we have enough content now to apply for Algolia DocSearch, with or without your contact. Do you want to handle that personally on behalf of ComCODE, or is anyone free to apply for us?

Go ham. I would love this. We can always upgrade our stuff after the fact.

danswann commented 4 months ago

Applied, awaiting approval.

danswann commented 4 months ago

Approved, ready to be integrated.

seanmakesgames commented 4 months ago

> Approved, ready to be integrated.

I assume this is just a standard set of instructions. Do you have a link for the integration instructions you can add to this issue?

danswann commented 4 months ago

The link is in the main issue body. Specifically, see the section "Installation steps when not using @docusaurus/preset-classic".
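For reference, a minimal sketch of what those steps amount to, assuming the site registers themes by hand instead of through the classic preset. Only the `themes` entry is shown; the rest of the wiki's actual config is not visible in this thread.

```typescript
// Sketch of a docusaurus.config.ts fragment for a site that does not use
// @docusaurus/preset-classic. Assumes the theme package was installed first:
//   npm install --save @docusaurus/theme-search-algolia
const config = {
  // Without the classic preset, the search theme must be listed explicitly.
  themes: ["@docusaurus/theme-search-algolia"],
};

// A real config file would end with `export default config;`.
```

The Algolia credentials themselves still go under `themeConfig.algolia` in the same file.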