bystrogenomics / bystro

Natural Language Search and Analysis of High Dimensional Genomic Data
Mozilla Public License 2.0

Improve query builder. #49

Open akotlar opened 6 years ago

akotlar commented 6 years ago

In the web app, move the query builder from regex-based parsing to a grammar-based parser, something like a PEG (e.g. Ohm).

akotlar commented 6 years ago

Partial fix made for the synonym issue. Unfortunately, there doesn't seem to be a perfect way to handle Pankaj's case, where one synonym is a prefix of another. My current solution avoids word boundaries, so that symbols like underscores are also allowed, and checks synonyms in descending order of length to be as greedy as possible.

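if (this.synonyms) {
  // Try longer synonyms first, so that a synonym which is a prefix of
  // another one does not match before the longer one gets a chance.
  const synonyms = Object.keys(this.synonyms).sort(
    (a, b) => b.length - a.length
  );

  for (const synonym of synonyms) {
    const val = this.synonyms[synonym];

    // Note: with a string pattern, replace() substitutes only the first occurrence.
    query = query.replace(synonym, () => {
      // A list of values expands to a parenthesized OR group.
      if (Array.isArray(val)) {
        return `( ${val.join(" OR ")} )`;
      }

      return val;
    });
  }
}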
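One possible way to tighten the prefix handling, sketched here as an assumption rather than as what the web app actually does: build a single alternation regex from the length-sorted synonyms and replace in one pass. In the loop above, a shorter synonym checked later can still match inside text produced by an earlier substitution; a single pass never rescans its own output. The names `escapeRegExp` and `expandSynonyms`, and the sample synonyms, are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: escape regex metacharacters so a synonym matches literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper, not part of the bystro codebase.
// `synonyms` maps a synonym to a replacement string or to an array of strings.
function expandSynonyms(query, synonyms) {
  // Longest first: at any position, the regex engine tries alternatives in order,
  // so a longer synonym wins over one that is merely its prefix.
  const keys = Object.keys(synonyms).sort((a, b) => b.length - a.length);

  if (keys.length === 0) {
    return query;
  }

  // No word boundaries, so synonyms containing symbols such as "_" still match.
  const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");

  // One global pass: every occurrence is replaced, and replaced text is not rescanned.
  return query.replace(pattern, (match) => {
    const val = synonyms[match];
    return Array.isArray(val) ? `( ${val.join(" OR ")} )` : val;
  });
}

// Made-up synonyms where one is the prefix of another.
const exampleSynonyms = {
  case: "phenotype:case",
  case_control: ["phenotype:case", "phenotype:control"],
};

console.log(expandSynonyms("case_control AND case", exampleSynonyms));
// prints "( phenotype:case OR phenotype:control ) AND phenotype:case"
```

Because word boundaries are still not used, a synonym can match inside a longer, unrelated token; that trade-off is the same as in the loop version.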