askorama / orama

🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!
https://docs.orama.com

input is not a string type (crash on `remove` method) #66

Closed thomscoder closed 2 years ago

thomscoder commented 2 years ago

Describe the bug

The remove method makes Lyra beta_17 crash.

It throws the following error:

TypeError: input.toLowerCase is not a function

at

const tokens = input.toLowerCase().split(splitRule);

To Reproduce

const movieDB = create({
  schema: {
    title: 'string',
    director: 'string',
    plot: 'string',
    year: 'number',
    isFavorite: 'boolean'
  }
});

const { id: harryPotter } = insert(movieDB, {
  title: 'Harry Potter and the Philosopher\'s Stone',
  director: 'Chris Columbus',
  plot: 'Harry Potter, an eleven-year-old orphan, discovers that he is a wizard and is invited to study at Hogwarts. Even as he escapes a dreary life and enters a world of magic, he finds trouble awaiting him.',
  year: 2001,
  isFavorite: false
});

remove(movieDB, harryPotter);

Screenshots

Screenshot 2022-07-31 at 00 31 24


thomscoder commented 2 years ago

EXPANDING: I was writing the tests for this and something very interesting happened. Consider the following example:

const db = create({
  schema: {
    quote: "string",
    author: "string",
  },
});

const {id: tomRiddle} = insert(db, {
  quote: "Harry Potter, the boy who lived, come to die. Avada kedavra.",
  author: "Tom Riddle",
});

remove(db, tomRiddle);

The test passes: the document is deleted successfully! But with a nested schema like the following, it fails with the aforementioned TypeError, and typeof input == 'object':

const db = create({
  schema: {
    quote: "string",
    author: {
      name: "string",
      surname: "string",
    },
  },
});

const {id: tomRiddle} = insert(db, {
  quote: "Harry Potter, the boy who lived, come to die. Avada kedavra.",
  author: {
    name: "Tom",
    surname: "Riddle",
  },
});

remove(db, tomRiddle);

The interesting part is in the tokenize function:

export function tokenize(input: string, language: Language = "english") {
  const splitRule = splitRegex[language];
  const tokens = input.toLowerCase().split(splitRule); // the error occurs in here
  return Array.from(new Set(trim(tokens)));
}

If the schema has properties whose type is not 'string', typeof input is the type of the first non-string property in the schema:

const movieDB = create({
  schema: {
    title: 'string',
    director: 'string',
    plot: 'string',
    year: 'number',
    isFavorite: 'boolean'
  }
});

const { id: harryPotter } = insert(movieDB, {
  title: 'Harry Potter and the Philosopher\'s Stone',
  director: 'Chris Columbus',
  plot: 'Harry Potter, an eleven-year-old orphan, discovers that he is a wizard and is invited to study at Hogwarts. Even as he escapes a dreary life and enters a world of magic, he finds trouble awaiting him.',
  year: 2001,
  isFavorite: false
});

remove(movieDB, harryPotter); 

Here typeof input == 'number' (year is the first non-string property).

const movieDB = create({
  schema: {
    title: 'string',
    director: 'string',
    plot: 'string',
    isFavorite: 'boolean'
  }
});

const { id: harryPotter } = insert(movieDB, {
  title: 'Harry Potter and the Philosopher\'s Stone',
  director: 'Chris Columbus',
  plot: 'Harry Potter, an eleven-year-old orphan, discovers that he is a wizard and is invited to study at Hogwarts. Even as he escapes a dreary life and enters a world of magic, he finds trouble awaiting him.',
  isFavorite: false
});

remove(movieDB, harryPotter);

With year removed, typeof input == 'boolean' (isFavorite is now the first non-string property).
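
To make the pattern above easier to check in isolation, here is a minimal, self-contained sketch. It is not taken from Lyra's source: the split pattern and the helpers firstFailingType and collectTokens are hypothetical names introduced only for illustration. It assumes the removal path tokenizes every value of the document in schema order, which would explain why the reported type is always that of the first non-string property, and it shows one possible guard that tokenizes only strings and recurses into nested objects.

```typescript
// Hypothetical stand-in for the language-specific split rule (assumption).
const SPLIT_RULE = /[^a-z0-9]+/i;

// Simplified tokenize: same failure mode as the original when input is not a string.
function tokenize(input: string): string[] {
  const tokens = input
    .toLowerCase()
    .split(SPLIT_RULE)
    .filter((t) => t.length > 0);
  return Array.from(new Set(tokens));
}

type FieldValue = string | number | boolean | Doc;
interface Doc {
  [key: string]: FieldValue;
}

// Unguarded walk (assumed behaviour): tokenize every value in order.
// Returns the typeof of the first value that makes tokenize throw, or null.
function firstFailingType(doc: Doc): string | null {
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    try {
      // The cast mimics a call site where the type checker is bypassed.
      tokenize(value as unknown as string);
    } catch (err) {
      return typeof value;
    }
  }
  return null;
}

// Guarded walk (one possible fix, hypothetical): tokenize strings,
// recurse into nested objects, and skip numbers and booleans.
function collectTokens(doc: Doc, prefix = ""): Map<string, string[]> {
  const out = new Map<string, string[]>();
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "string") {
      out.set(path, tokenize(value));
    } else if (typeof value === "object" && value !== null) {
      collectTokens(value, path).forEach((tokens, p) => out.set(p, tokens));
    }
  }
  return out;
}

console.log(firstFailingType({ title: "Harry Potter", year: 2001, isFavorite: false })); // "number"
console.log(firstFailingType({ title: "Harry Potter", isFavorite: false })); // "boolean"
console.log(firstFailingType({ quote: "Avada kedavra", author: { name: "Tom", surname: "Riddle" } })); // "object"
```

With the guard, the nested document from the failing test yields tokens for quote, author.name and author.surname, and non-string fields are simply ignored instead of reaching toLowerCase.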


Hope it helps 😄

micheleriva commented 2 years ago

Thank you @thomscoder!

I fixed this with https://github.com/nearform/lyra/commit/3cb1c75e6e0e16b2e636e39fc494f66bb0900fb6. It will be published with the next release.