FoundatioFx / Foundatio.Parsers

A lucene style query parser that is extensible and allows modifying the query.
https://www.nuget.org/packages/Foundatio.Parsers.LuceneQueries/
Apache License 2.0

Unable to parse query with escaped characters #49

Closed jonnermut closed 4 years ago

jonnermut commented 4 years ago

ElasticSearch / Lucene allows escaping characters with a backslash: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
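
For illustration, here is a minimal sketch of what that backslash escaping looks like when applied from C#. The `QueryEscaper` class and its `Escape` method are hypothetical helpers, not part of Foundatio.Parsers. The reserved-character list follows the Elasticsearch query_string documentation linked above, which also notes that `<` and `>` cannot be escaped and have to be removed instead.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of Foundatio.Parsers.
public static class QueryEscaper
{
    // Characters the Elasticsearch query_string docs list as reserved.
    // '&' and '|' are escaped individually, which also covers "&&" and "||".
    private const string Reserved = "+-=!(){}[]^\"~*?:\\/&|";

    public static string Escape(string text)
    {
        var sb = new StringBuilder(text.Length * 2);
        foreach (char c in text)
        {
            // '<' and '>' cannot be escaped, so drop them entirely.
            if (c == '<' || c == '>')
                continue;

            // Prefix every reserved character with a backslash.
            if (Reserved.IndexOf(c) >= 0)
                sb.Append('\\');

            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this sketch, `QueryEscaper.Escape("a:b")` yields the raw text `a\:b`. Whether a given parser accepts every escaped character is a separate question, which is what this issue is about.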

Feeding queries containing backslash-escaped characters into an ElasticQueryParser fails with a parse exception. The same query string works when used in ElasticSearch directly. Test case below:

using System.Diagnostics;
using System.Threading.Tasks;
using Foundatio.Parsers.ElasticQueries;
using Foundatio.Parsers.ElasticQueries.Visitors;
using Foundatio.Parsers.LuceneQueries.Visitors;

namespace QueryEscapingTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var simple = ParseAndRewrite("normal").Result;
            Debug.Assert(simple == "normal");

            var escaped = ParseAndRewrite("\\\"escaped").Result;
            Debug.Assert(escaped == "\\\"escaped");
        }

        static async Task<string> ParseAndRewrite(string query)
        {
            QueryFieldResolver resolver = (field) =>
            {
                return field;
            };

            ElasticQueryVisitorContext context = new ElasticQueryVisitorContext { QueryType = QueryType.Query };

            var parser = new ElasticQueryParser(conf => 
                conf.UseFieldResolver(resolver)
                    .UseValidation(async info => true)
            );

            var queryNode = await parser.ParseAsync(query, context);
            return queryNode.ToString();
        }

    }
}

Project file: QueryEscapingTest.zip

ejsmith commented 4 years ago

@jonnermut this should be fixed now. Thanks for reporting!

jonnermut commented 4 years ago

Awesome, thanks!

InsomniumBR commented 1 month ago

@ejsmith could you let me know if I am doing something wrong according to the Lucene syntax here? I grabbed the 4th line from the original bug and just moved the backslash to the end of the string. The first 3 lines relate to what I was doing before seeing this old bug.

            var parser = new LuceneQueryParser();
            _ = parser.Parse("a:123"); // OK
            _ = parser.Parse("a:\"123\""); // OK
            _ = parser.Parse("a:\"123\\\""); // Exception: Unterminated quoted string

            _ = parser.Parse("\\\"escaped");  // OK
            _ = parser.Parse("\"escaped\\"); // Exception: Unterminated quoted string
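
Because the C# literals add their own layer of escaping, it may help to see the raw query text the parser actually receives. The sketch below is plain C# with no Foundatio calls. The `LiteralDemo` class and both of its methods are hypothetical names introduced here. `EndsWithEscapedQuote` assumes the usual convention that an odd number of backslashes directly before a quote escapes it. Whether the parser should follow that convention in these cases is the open question.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Foundatio.Parsers.
public static class LiteralDemo
{
    // The last four literals from the snippet above.
    // The comment on each line shows the raw text after C# unescaping.
    public static string[] RawQueries() => new[]
    {
        "a:\"123\"",     // a:"123"
        "a:\"123\\\"",   // a:"123\"
        "\\\"escaped",   // \"escaped
        "\"escaped\\",   // "escaped\
    };

    // True when the query ends with a quote that is preceded by an odd
    // number of backslashes, i.e. the final quote is itself escaped.
    public static bool EndsWithEscapedQuote(string query)
    {
        if (!query.EndsWith("\"", StringComparison.Ordinal))
            return false;

        int backslashes = 0;
        for (int i = query.Length - 2; i >= 0 && query[i] == '\\'; i--)
            backslashes++;

        return backslashes % 2 == 1;
    }
}
```

Printing each entry of `LiteralDemo.RawQueries()` with `Console.WriteLine` shows the text in the comments. Under the stated assumption, only the second query ends with an escaped quote. The fourth ends with a lone backslash and has no closing quote at all.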

Thank you

niemyjski commented 1 month ago

Would you mind creating a new issue for this, @InsomniumBR? The first one looks like it could be a bug; for the last one, I'm wondering how that even compiles. If you could add a PR, or post a snippet with updates to the existing tests, that would be a huge help.

avillarrealRW commented 1 week ago

Hello @niemyjski, I'm a coworker of @InsomniumBR and was helping out with adding some tests based on the issues we were running into. Here is the PR that I created for that, thank you! https://github.com/FoundatioFx/Foundatio.Parsers/pull/92