gkunter / coquery

Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpus.
GNU General Public License v3.0

EXTRACT function with question marks breaks incorrigibly #297

Closed gkunter closed 5 years ago

gkunter commented 5 years ago

If the regular expression passed to an EXTRACT function contains many question marks because the user did not have regular expression syntax in mind, the function fails with an exception. Test case: ^? ??? applied to the column Query string (the query was /? ??? ? / on CMUdict, meant to extract all monosyllabic words that consist of a closed syllable from the dictionary):

Type: sre_constants.error
Message: nothing to repeat at position 6 Error during function call EXTRACT(Query string, '^? ???', 0)
 functionlist.py, line 72: lapply
   functions.py, line 388: evaluate
     functions.py, line 335: evaluate
       functions.py, line 335: <listcomp>
       > source.tell() - here + len(this))

Because the function never successfully generates a column, it can no longer be edited or removed, so there is no way to get rid of the error.

There are basically two issues at hand:

  1. Erroneous regular expressions should not cause exceptions
  2. Functions that raise exceptions should be removed, or should create empty columns
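
Both points could be handled along the following lines. This is only a minimal sketch, not the actual Coquery code: the name `safe_extract`, its signature, and the `(column, error)` return value are assumptions made for illustration. The idea is to compile the pattern once up front, catch `re.error` (the same exception class that older Python versions report as `sre_constants.error`), and fall back to an empty column so that the function stays editable and removable.

```python
import re


def safe_extract(values, pattern, group=0):
    """Hypothetical stand-in for EXTRACT; not part of Coquery.

    Returns a tuple (column, error). If the pattern or group index is
    invalid, the column contains only None and error holds a message
    that a caller could show to the user instead of raising.
    """
    try:
        # Compile once so a faulty pattern fails here, before any row
        # is processed.
        regex = re.compile(pattern)
    except re.error as exc:
        return [None] * len(values), "invalid regular expression: {}".format(exc)

    # A group index beyond the number of groups in the pattern would
    # raise IndexError later, so it is rejected here as well.
    if not 0 <= group <= regex.groups:
        return [None] * len(values), "no such group: {}".format(group)

    column = []
    for value in values:
        # Skip cells that are not strings (for example missing values).
        match = regex.search(value) if isinstance(value, str) else None
        column.append(match.group(group) if match else None)
    return column, None


# The pattern from the test case yields an empty column plus a message.
column, error = safe_extract(["K AE1 T", "D AO1 G"], "^? ???")
```

With a valid pattern, `error` is `None` and `column` holds one match (or `None`) per input value. If the question marks are meant literally, passing the pattern through `re.escape` first would be one way to avoid the syntax error altogether.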