bripkens / lucene

Node.js lib to transform: lucene query → syntax tree → lucene query
MIT License

Support for lower case operators #37

Closed: DenisaSurdu closed this issue 3 years ago

DenisaSurdu commented 3 years ago

I'm using the library to support free-text search. If I parse the query a OR b, the tree is built with "operator": "OR", like this one:

{
   "left": {
      "field": "<implicit>",
      "fieldLocation": null,
      "term": "a",
      "quoted": false,
      "regex": false,
      "termLocation": {
         "start": {
            "offset": 0,
            "line": 1,
            "column": 1
         },
         "end": {
            "offset": 2,
            "line": 1,
            "column": 3
         }
      },
      "similarity": null,
      "boost": null,
      "prefix": null
   },
   "operator": "OR",
   "right": {
      "field": "<implicit>",
      "fieldLocation": null,
      "term": "b",
      "quoted": false,
      "regex": false,
      "termLocation": {
         "start": {
            "offset": 5,
            "line": 1,
            "column": 6
         },
         "end": {
            "offset": 6,
            "line": 1,
            "column": 7
         }
      },
      "similarity": null,
      "boost": null,
      "prefix": null
   }
}

However, if I parse a or b, the operator is implicit ("operator": "<implicit>") and or is parsed as an ordinary term, as can be seen in this tree:

{
   "left": {
      "field": "<implicit>",
      "fieldLocation": null,
      "term": "a",
      "quoted": false,
      "regex": false,
      "termLocation": {
         "start": {
            "offset": 0,
            "line": 1,
            "column": 1
         },
         "end": {
            "offset": 2,
            "line": 1,
            "column": 3
         }
      },
      "similarity": null,
      "boost": null,
      "prefix": null
   },
   "operator": "<implicit>",
   "right": {
      "left": {
         "field": "<implicit>",
         "fieldLocation": null,
         "term": "or",
         "quoted": false,
         "regex": false,
         "termLocation": {
            "start": {
               "offset": 2,
               "line": 1,
               "column": 3
            },
            "end": {
               "offset": 5,
               "line": 1,
               "column": 6
            }
         },
         "similarity": null,
         "boost": null,
         "prefix": null
      },
      "operator": "<implicit>",
      "right": {
         "field": "<implicit>",
         "fieldLocation": null,
         "term": "b",
         "quoted": false,
         "regex": false,
         "termLocation": {
            "start": {

It would be nice to have a way to extend the grammar, and therefore the parser, so that it also accepts the lower-case equivalents of the operators.

bripkens commented 3 years ago

Hello @DenisaSurdu,

In the Lucene query language, upper and lower case carry specific meaning: an upper-case OR is an operator, while a lower-case or is a search term. I am therefore afraid that I'll have to close this issue.

Also see this ticket comment, which shows a reference from the official JVM Lucene parser.
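
For anyone who still wants lower-case operators in their own application, one possible workaround is to normalize the query string before handing it to the parser. The sketch below is only an illustration: normalizeOperators is a hypothetical helper, not part of this library, and it assumes that and, or and not should be upper-cased whenever they appear as standalone words outside double-quoted phrases. Note that this deliberately departs from Lucene semantics, so a user can no longer search for the literal word or unless they quote it.

```javascript
// Hypothetical helper (not part of the lucene package): upper-cases the
// standalone words and / or / not so that a Lucene parser sees them as operators.
function normalizeOperators(query) {
  // Splitting on a capture group keeps the quoted phrases in the result:
  // even indices are unquoted text, odd indices are double-quoted phrases.
  const parts = query.split(/("(?:[^"\\]|\\.)*")/);

  return parts
    .map((part, index) => {
      if (index % 2 === 1) {
        return part; // leave quoted phrases untouched
      }
      // Match the word only when it is delimited by whitespace, parentheses,
      // or the start / end of the unquoted chunk, so that terms such as
      // "origin" and fields such as "color:red" are left alone.
      return part.replace(
        /(^|[\s(])(and|or|not)(?=$|[\s)])/gi,
        (match, prefix, operator) => prefix + operator.toUpperCase()
      );
    })
    .join('');
}

// Assumed usage: normalize first, then pass the result to the parser.
console.log(normalizeOperators('a or b'));
console.log(normalizeOperators('a or "x and y" not b'));
```

The normalized string can then be parsed as usual, so a or b would produce the same tree as a OR b in the first example of this issue.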