UnitexGramLab / gramlab-ide

Unitex/GramLab Java IDE
https://unitexgramlab.org

Search fails on query like a:b #173

Open eric-laporte opened 8 months ago

eric-laporte commented 8 months ago

A search query of the form a:b fails. The interface displays an error message suggesting that the colon has been mistaken for a subgraph call marker.

What steps will reproduce the problem?

  1. Open the French 80jours corpus with default configuration
  2. Launch Locate pattern with default configuration and a query of the form a:b, e.g. trois:sienne or jours:6, without spaces before or after the ':' character

What is the expected output?

A "Result info" dialog box should provide the number of occurrences found.

What do you see instead?

An error message is displayed in red: 'regexp: unexpected subgraph call in token_sequence_2_integer_sequence'. The search fails.

More info

martinec commented 8 months ago

Internally, an expression like trois:sienne is converted into a graph before the search runs. Within a graph, the colon : is interpreted as an instruction to call a sub-graph, hence the error.

For the time being, you can type trois\:sienne as a workaround.

Since this issue is directly related to the use of Locate with Regular Expressions rather than with Graphs, a future enhancement could involve automatically escaping colons before generating the graph representation.
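
As a rough illustration of that enhancement, here is a minimal sketch of a pre-processing step that escapes colons in a regular-expression query. The class and method names (ColonEscaper, escapeColons) are hypothetical and do not exist in gramlab-ide, and where such a step would hook into the Locate code path is an assumption. The sketch leaves any existing backslash escape untouched, so a query already typed as trois\:sienne is not escaped twice.

```java
/**
 * Hypothetical helper, NOT part of the gramlab-ide code base.
 * Sketch of escaping colons in a Locate regular-expression query
 * before it is converted into a graph.
 */
final class ColonEscaper {
    private ColonEscaper() {}

    /** Returns the query with every unescaped ':' rewritten as "\:". */
    static String escapeColons(String query) {
        StringBuilder out = new StringBuilder(query.length() + 4);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Existing escape sequence: copy the backslash and the
                // character it protects without modifying either.
                out.append(c).append(query.charAt(++i));
            } else if (c == ':') {
                // Bare colon: protect it so it is not read as a sub-graph call.
                out.append('\\').append(':');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeColons("trois:sienne"));   // prints trois\:sienne
        System.out.println(escapeColons("trois\\:sienne")); // prints trois\:sienne (unchanged)
    }
}
```

With this in place, the queries from the report (trois:sienne, jours:6) would be handed to the graph conversion in the same form as the manual workaround above.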