cheatcode / joystick

A full-stack JavaScript framework for building stable, easy-to-maintain apps and websites.
https://cheatcode.co/joystick

Add a simple full-text search #229

Open rglover opened 1 year ago

rglover commented 1 year ago

Was spacing out and had this thought...it'd be relatively simple to do a lightweight full-text search engine. Did a quick test to prove it and sure enough the basic idea works.

I'd have to think about how to sell it, but for a simple search of posts on something like CheatCode, you could get it done for peanuts in terms of memory and result speed. It wouldn't be a one-size-fits-all or "adapt to any data set" type of solution, but it'd definitely be good enough for most people who want a decent full-text search w/o wiring up a separate search server or third-party service.

Blind tests of indexing 1,000,000 records only took ~3-5s on average. Search results with a crappy search algo took 1-2s max (and that's with me foolishly loading the entire index into memory). For example, my extremely naive test:

import fs from 'fs';

const index = [];
const post = {
  _id: 'abc123',
  title: 'How to Download a Zip File in Node.js',
  content: 'In this post, we will learn how to download a file as a .zip archive.',
};

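// Index the same post repeatedly to simulate a larger data set.
const entries = Object.entries(post);

for (let i = 0; i < 300; i += 1) {
  // One index entry per field (_id, title, content).
  for (const [key, value] of entries) {
    index.push({
      _id: post?._id,
      key,
      // Naive tokenization: split on spaces, punctuation stays attached.
      tokens: value?.split(' '),
    });
  }
}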

fs.writeFile('posts_index.json', JSON.stringify(index), (error) => {
  if (error) console.warn(error);
});
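
For the search side, here is a minimal sketch of what a "crappy search algo" over that index shape could look like. It is not the test code from the comment above: `naive_search` and `normalize` are hypothetical names, the index is kept in memory instead of being read back from posts_index.json, and the second post (`def456`) is made up for illustration.

```javascript
// Same entry shape as the test above: one entry per field, tokens split on spaces.
const index = [
  { _id: 'abc123', key: 'title', tokens: 'How to Download a Zip File in Node.js'.split(' ') },
  { _id: 'abc123', key: 'content', tokens: 'In this post, we will learn how to download a file as a .zip archive.'.split(' ') },
  { _id: 'def456', key: 'title', tokens: 'How to Upload a File in Node.js'.split(' ') }, // made-up second post
];

// Hypothetical helper: lowercase a token and strip leading/trailing punctuation.
const normalize = (token = '') => token.toLowerCase().replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, '');

// Hypothetical helper: score each _id by how many query terms appear in its entries.
const naive_search = (index = [], query = '') => {
  const terms = query.split(/\s+/).map(normalize).filter(Boolean);
  const scores = new Map();

  for (const entry of index) {
    const tokens = new Set((entry.tokens || []).map(normalize));
    const hits = terms.filter((term) => tokens.has(term)).length;
    if (hits > 0) scores.set(entry._id, (scores.get(entry._id) || 0) + hits);
  }

  // Highest score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([_id, score]) => ({ _id, score }));
};

const results = naive_search(index, 'download zip');
const file_results = naive_search(index, 'file');
```

This is a linear scan over every entry, so it only stays fast while the index fits comfortably in memory.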

This is a post-1.0 idea, but I think it'd be worth adding to @joystick.js/node as an option, as the overwhelming majority of folks won't need a heavy metal search setup. Just something that works and is perceptibly fast with a small-ish footprint.

rglover commented 1 year ago

Came up with a cheap way to do this using Fuse.js. Basically set up a getter that runs a DB query, creates the index, runs the search against it, and then lets the index drop out of memory. Only gotcha is scaling, which could be handled by having a way to define a single index on the server that routinely updates on a tick (or by using an observer on the DB query).

This is ideal because you could get a cheap real-time search built in to Joystick. It could even work as a standalone server so you could have something like search.cheatcode.co that could have its memory scaled.
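
A rough sketch of the "single index that updates on a tick" idea, under some loud assumptions: `create_shared_index` and `load_records` are hypothetical names, `load_records` is a synchronous stand-in for the DB query (a real one would be async), and plain substring matching stands in for Fuse.js so the example has no dependencies.

```javascript
// Hypothetical helper: keeps one shared in-memory index and rebuilds it on a timer.
const create_shared_index = ({ load_records, interval_ms = 60 * 1000 }) => {
  let entries = [];

  // Rebuild the whole index from the current records.
  const refresh = () => {
    entries = load_records().map((record) => ({
      _id: record._id,
      text: `${record.title || ''} ${record.content || ''}`.toLowerCase(),
    }));
  };

  // Substring match as a stand-in for a real fuzzy search library.
  const search = (query = '') => {
    const needle = query.trim().toLowerCase();
    if (!needle) return [];
    return entries.filter((entry) => entry.text.includes(needle)).map((entry) => entry._id);
  };

  refresh();
  const timer = setInterval(refresh, interval_ms);
  if (timer && typeof timer.unref === 'function') timer.unref(); // don't keep the process alive

  return { search, refresh, stop: () => clearInterval(timer) };
};

// Stand-in for a posts collection.
const posts = [
  { _id: 'abc123', title: 'How to Download a Zip File in Node.js', content: 'Download a file as a .zip archive.' },
];

const posts_index = create_shared_index({ load_records: () => posts });

const before = posts_index.search('zip');
posts.push({ _id: 'def456', title: 'Zip Files in the Browser', content: '' }); // made-up post
posts_index.refresh(); // normally the timer would trigger this
const after = posts_index.search('zip');
posts_index.stop();
```

Every request reads from the same `entries` array, so memory use is bounded by one copy of the index, not one per query. The tradeoff is that results can be stale by up to one tick.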

rglover commented 1 month ago

Prefer the package fuzzysort for this, but everything else is the same. Basically you just seed it with your data and then call it to get the results.

I could see a version of this like:

import joystick, { search } from '@joystick.js/ui';

const MyComponent = joystick.component({
  events: {
    'keyup [name="search"]': async (event = {}, instance = {}) => {
      const search_results = await search('<index_name>', event.target.value); // Second arg is the search query
      instance.set_state({ search_results });
    },
  },
  render: () => {
    return `
      <input type="search" name="search" />
    `;
  },
});

Basically, that would hit an endpoint like /api/_search/index_name and just run the query against the pre-defined index.
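
On the server side, the handler behind that endpoint might look roughly like the sketch below. None of this is an existing Joystick API: `handle_search_request`, the `indexes` object, and the `query` URL parameter are all assumptions for illustration, and substring matching stands in for fuzzysort.

```javascript
// Hypothetical pre-defined indexes, keyed by index name.
const indexes = {
  posts: [
    { _id: 'abc123', title: 'How to Download a Zip File in Node.js' },
    { _id: 'def456', title: 'How to Upload a File in Node.js' }, // made-up post
  ],
};

// Hypothetical handler: maps /api/_search/<index_name>?query=<text> to a result list.
const handle_search_request = (request_url = '', indexes = {}) => {
  const url = new URL(request_url, 'http://localhost');
  const match = url.pathname.match(/^\/api\/_search\/([^/]+)$/);
  if (!match) return { status: 404, body: { error: 'Not found' } };

  const index_name = match[1];
  if (!Object.prototype.hasOwnProperty.call(indexes, index_name)) {
    return { status: 404, body: { error: 'Unknown index' } };
  }

  // Assumed: the search text arrives as a "query" URL parameter.
  const query = (url.searchParams.get('query') || '').trim().toLowerCase();
  const results = query
    ? indexes[index_name].filter((record) => record.title.toLowerCase().includes(query))
    : [];

  return { status: 200, body: { results } };
};

const response = handle_search_request('/api/_search/posts?query=zip', indexes);
const missing = handle_search_request('/api/_search/comments?query=zip', indexes);
```

Keeping the handler a pure function of the URL and the indexes makes it easy to mount on whatever HTTP layer the framework exposes.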