xPMo / hemppa-bot

Trying out hemppa to make a bot
GNU General Public License v3.0

Search Show Notes #2

Open Southporter opened 2 years ago

Southporter commented 2 years ago

I want to get some feedback: what do you think about adding a show notes search via the bot? The community has notes.jupiterbroadcasting.com, which already does some searching. Maybe we can hook into that to do lookups.

Possible command structures:

  • !search CoderRadio Crystal
  • !find LUP Manjaro
  • !search nebula vpn

The show hint part could be optional. Looks like the notes site searches all shows by default.
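To make the optional show hint concrete, here is a rough sketch of how such a command might be split into a hint and search terms. The `KNOWN_SHOWS` set and the `parse_search` name are made up for illustration; a real module would need the actual list of shows from the notes site.

```python
# Hypothetical set of show hints; not taken from the notes site.
KNOWN_SHOWS = {"coderradio", "lup", "selfhosted"}


def parse_search(message: str):
    """Split '!search [show] terms...' into (show_or_None, terms).

    Returns None when the message is not a search command.
    """
    parts = message.split()
    if not parts or parts[0] not in ("!search", "!find"):
        return None
    words = parts[1:]
    show = None
    # Treat the first word as a show hint only if it names a known show.
    if words and words[0].lower() in KNOWN_SHOWS:
        show = words[0].lower()
        words = words[1:]
    return show, " ".join(words)
```

With this approach a query whose first word is not a known show simply searches everything, which matches the notes site's default behaviour.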

I'm planning on taking a stab at this when I have some time, but wanted to get some discussion going.

xPMo commented 2 years ago

Good thought! The search is a fancy dynamic thing which I don't want to deconstruct. It would be nice if search sharing were enabled; then we could generate that URL and parse its output instead.

I don't know if there's a machine-consumable endpoint for mkdocs; I can't find one atm.

Southporter commented 2 years ago

So looking into the mkdocs plugin, it looks like it does provide the search_index at https://notes.jupiterbroadcasting.com/search/search_index.json. That's a pretty bulky JSON file, but we only need to fetch it once a week (the notes site is built every Friday). So there's no easy way to just query an endpoint, but we could build the index cache as part of the module.
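A minimal sketch of the "download the index and search it ourselves" idea, assuming the file has the layout the MkDocs search plugin normally emits: a top-level `docs` list whose entries carry `location`, `title` and `text`. The function names are illustrative, and the matching is a naive substring test rather than the ranked search the site itself does.

```python
import json
import urllib.request

INDEX_URL = "https://notes.jupiterbroadcasting.com/search/search_index.json"


def fetch_index(url: str = INDEX_URL) -> dict:
    """Download and decode a search_index.json (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def search_docs(index: dict, query: str, limit: int = 5) -> list:
    """Return (title, location) for entries containing every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    # Assumed layout: {"docs": [{"location": ..., "title": ..., "text": ...}]}
    for doc in index.get("docs", []):
        haystack = (doc.get("title", "") + " " + doc.get("text", "")).lower()
        if all(term in haystack for term in terms):
            hits.append((doc.get("title", ""), doc.get("location", "")))
            if len(hits) == limit:
                break
    return hits
```

Each `location` is relative to the site root, so a reply could link to the site URL joined with that value.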

I'll create a PR against the notes site to enable search sharing.

xPMo commented 2 years ago

One design paradigm Hemppa uses is that it stores as much state as possible in userdata on the server. Storing the whole search index would be wasteful; pushing it up to the server every time a state change happens locally is not good.

My first opinion for bot command and subcommands:

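Using a local cache of the search_index.json:

  • !mkdocs add ... sets the URL to pull from for a given name.
  • !mkdocs pull pulls the newest search_index for the named site.
  • !mkdocs search runs a search against the named site. Using the alias system, we can alias !jbnotes / !notes or similar to !mkdocs search jbnotes.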

Note that I'm leaving it flexible enough so that we can support multiple sites. iirc, some of the hosts have public mkdocs instances that we might want to support, or there might be a few projects which are close to the community that we could support searchable docs.
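The split between small server-side state and a large local cache could look roughly like this. It is a plain-Python sketch, not Hemppa's module API: the `MkdocsSites` class, its method names and the injected `fetch` callable are all hypothetical, and the index layout (`docs` entries with `location`, `title`, `text`) is assumed.

```python
class MkdocsSites:
    """Hypothetical registry: small persisted state, large local-only cache."""

    def __init__(self, fetch):
        self.urls = {}       # name -> index URL; small enough to persist
        self.indexes = {}    # name -> parsed index; kept locally only
        self._fetch = fetch  # callable taking a URL and returning a dict

    def add(self, name, url):
        """Register the index URL for a site name."""
        self.urls[name] = url

    def pull(self, name):
        """Refresh the local index for a site; returns the entry count."""
        self.indexes[name] = self._fetch(self.urls[name])
        return len(self.indexes[name].get("docs", []))

    def search(self, name, query, limit=5):
        """Return locations of entries containing every query term."""
        if name not in self.indexes:
            self.pull(name)  # lazily fetch on first use
        terms = query.lower().split()
        hits = []
        for doc in self.indexes[name].get("docs", []):
            text = (doc.get("title", "") + " " + doc.get("text", "")).lower()
            if terms and all(t in text for t in terms):
                hits.append(doc.get("location", ""))
        return hits[:limit]

    def persisted_state(self):
        """Only the name-to-URL map would be pushed to the server."""
        return {"urls": dict(self.urls)}
```

Keeping the parsed indexes out of `persisted_state()` is the point: a state change uploads a handful of URLs rather than a multi-megabyte index.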

Southporter commented 2 years ago

You bring up a good point. Making it more flexible will allow for things like wiki.selfhosted.show or potentially something like blog.ktz.me.

I was already planning on having an !search update command to manually refresh the indexes.

To make sure we are on the same page, are you saying that we should store the search index locally? Would we want to have some sort of TTL on the index and automatically update it after x days?
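For reference, a TTL check of the kind I have in mind could be as small as the sketch below. The `is_stale` name and the seven-day default are assumptions, the latter chosen only because the notes site is rebuilt weekly.

```python
from datetime import datetime, timedelta, timezone


def is_stale(fetched_at, now=None, ttl=timedelta(days=7)):
    """Return True when the cached index is missing or older than ttl."""
    if fetched_at is None:
        return True  # never fetched, so it must be pulled
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at >= ttl
```

A search handler would call this with the timestamp of the last pull and refresh the index before searching whenever it returns True.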

On Tue, May 17, 2022, 5:44 PM Gamma @.***> wrote:

One design paradigm Hemppa uses is that it stores as much state as possible in userdata on the server. Storing the whole search index would be wasteful; pushing it up to the server every time a state change happens locally is not good.

My first opinion for bot command and subcommands:

Using a local cache of the search_index.json:

  • !mkdocs add ... sets the URL to pull from for a given name.
  • !mkdocs pull pulls the newest search_index for the named site.
  • !mkdocs search runs a search against the named site. Using the alias system, we can alias !jbnotes / !notes or similar to !mkdocs search jbnotes.

Note that I'm leaving it flexible enough so that we can support multiple sites. iirc, some of the hosts have public mkdocs instances that we might want to support, or there might be a few projects which are close to the community that we could support searchable docs.


xPMo commented 2 years ago

To make sure we are on the same page, are you saying that we should store the search index locally?

Yes. This is assuming we go with the "download the giant json and just search it" method.

Would we want to have some sort of TTL on the index and automatically update it after x days?

The laziest approach is to have a command to manually do it, and then use the existing !cron module to update it regularly.

Southporter commented 2 years ago

@xPMo I've got a PoC put together. What have you been doing to test? I saw your update to the README about the .env file. I guess I can create a private room and a user for my testing, unless you have an easier route.

xPMo commented 2 years ago

I've been testing on prod so far, but we should change that. Check vranki/hemppa for more info on getting an access token.