osayouba / opendatakit

Automatically exported from code.google.com/p/opendatakit

regex causes ODK to crash #983

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The help at http://opendatakit.org/help/form-design/binding warns that complex 
regex patterns may cause stack overflow crashes. But I think I'm encountering a 
different problem with similar symptoms. I have a form that has a repeat to 
solicit any number of 10 digit ID numbers (mwid). I want to check that the ID 
numbers are not duplicated. I am doing this with (a) a calculated field 
outside the repeat (mwids), whose calculation is:
    join(' ',${mwid})
and (b) a required question inside the loop with relevant condition:
    regex(${mwids},concat(${mwid},'.*?',${mwid}))
and constraint false().
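
For readers unfamiliar with the pattern, the check described above can be sketched outside ODK. This is only an illustration in Python, not form syntax: the function names are hypothetical, `re.escape` is an added safeguard that has no effect on purely numeric IDs, and Python's `re` module is not the regex engine ODK Collect uses. It assumes the pattern may match anywhere in the joined string, which the expression above relies on.

```python
import re

def join_ids(ids):
    # Mirrors join(' ', ${mwid}): one space-separated string of all entered IDs.
    return " ".join(ids)

def is_duplicated(mwid, ids):
    # Mirrors regex(${mwids}, concat(${mwid}, '.*?', ${mwid})):
    # true when the ID occurs, then any text (matched lazily), then the same ID again.
    pattern = re.escape(mwid) + ".*?" + re.escape(mwid)
    return re.search(pattern, join_ids(ids)) is not None

ids = ["1234567890", "2222222222", "1234567890"]
print(is_duplicated("1234567890", ids))  # True: entered twice
print(is_duplicated("2222222222", ids))  # False: entered once
```

In the form, a true result makes the required question relevant, and its constraint false() then blocks progress until the duplicate is corrected.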

This all works as long as I don't enter more than 5 ID numbers. If I enter 8 
or more ID numbers, it all seems to work until the form is saved, at which 
point ODK Collect 1.4 (1038) crashes with the message "The application ODK 
Collect (process org.odk.collect.android) has stopped unexpectedly. Please try 
again." Before it crashes, it writes an .xml.save file (in the .cache folder) 
containing all the data from the current instance. If I enter 6 or 7 numbers, 
ODK sometimes crashes and sometimes doesn't. If I make the ID number 5 digits 
instead of 10 digits, I can enter more ID numbers before the crash occurs on 
saving.

So it looks to me as if it is the length of the string to be searched, rather 
than the complexity of the regex pattern, that is causing the crash, and that 
for some reason, the crash is only induced when ODK Collect re-checks the 
current instance data before saving.
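
To put rough numbers on the string lengths involved, here is a small Python sketch (the helper name is hypothetical). It only computes the length of the space-joined string for a given number of IDs and says nothing about where the actual crash threshold lies.

```python
def joined_length(count, digits):
    # Length of `count` IDs of `digits` characters each, joined by single spaces.
    if count == 0:
        return 0
    return count * digits + (count - 1)

print(joined_length(5, 10))  # 54 characters: reported to work
print(joined_length(8, 10))  # 87 characters: reported to crash on save
print(joined_length(8, 5))   # 47 characters: same count, 5-digit IDs
```

With 5-digit IDs the joined string stays shorter for the same number of entries, which is consistent with being able to enter more IDs before the crash.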

The XLSForm definition and catlog output are attached (the crash happened at 
13:01:34.031).

Original issue reported on code.google.com by james.be...@gmail.com on 12 Mar 2014 at 7:50

GoogleCodeExporter commented 9 years ago
Haven't had a chance to try this on the in-the-works software, but you might 
try this Regex...

Following an idea from 
http://stackoverflow.com/questions/863125/regular-expression-to-count-number-of-commas-in-a-string

It may be less costly:

    regex(${mwids},concat('(?:',${mwid},'.*){2}'))
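
As a rough check of what this pattern accepts, here is a Python sketch (the function name is hypothetical, and `re.escape` is an addition). Python's `re` module is not the engine ODK Collect uses, so this only illustrates which strings match; it says nothing about cost or stack depth on the device.

```python
import re

def is_duplicated_counted(mwid, mwids):
    # Mirrors regex(${mwids}, concat('(?:', ${mwid}, '.*){2}')):
    # the non-capturing group "ID followed by any text" must match twice in a row.
    pattern = "(?:" + re.escape(mwid) + ".*){2}"
    return re.search(pattern, mwids) is not None

mwids = "1234567890 2222222222 1234567890"
print(is_duplicated_counted("1234567890", mwids))  # True: entered twice
print(is_duplicated_counted("2222222222", mwids))  # False: entered once
```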

Original comment by mitchellsundt@gmail.com on 13 Mar 2014 at 4:39

GoogleCodeExporter commented 9 years ago
Thanks for the tip (I didn't know about non-capturing groups before). 
Unfortunately, with the current version, the saving isn't large enough to 
prevent the crash. 

Original comment by james.be...@gmail.com on 21 Mar 2014 at 11:12

GoogleCodeExporter commented 9 years ago
If your ID values do not contain spaces and you are creating a space-separated 
string of values, you can use the selected() function to access the string (and 
the count-selected() or selected-at() functions, too). These all work on string 
fields, and expect a space-separated list of values (no leading or trailing 
spaces).
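
A minimal Python sketch of how these functions treat a space-separated string, for illustration only. The Python names are hypothetical stand-ins for the ODK functions, the zero-based index and the empty-string result for an out-of-range index are assumptions of this sketch, and `has_duplicate` is just one possible token-based duplicate check, not a formula taken from this thread.

```python
from collections import Counter

def selected(values, item):
    # Like selected(): is `item` one of the space-separated tokens?
    return item in values.split()

def count_selected(values):
    # Like count-selected(): how many tokens the string contains.
    return len(values.split())

def selected_at(values, index):
    # Like selected-at(): the token at a zero-based index (assumed),
    # or an empty string when the index is out of range (assumed).
    tokens = values.split()
    return tokens[index] if 0 <= index < len(tokens) else ""

def has_duplicate(values, item):
    # Hypothetical helper: compares whole tokens, so no regex is involved.
    return Counter(values.split())[item] > 1

mwids = "1234567890 2222222222 1234567890"
print(selected(mwids, "2222222222"))       # True
print(count_selected(mwids))               # 3
print(selected_at(mwids, 1))               # 2222222222
print(has_duplicate(mwids, "1234567890"))  # True
```

Because the comparison is on whole tokens rather than on a pattern scanned across the full string, this approach avoids the regex evaluation altogether.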

Original comment by mitchellsundt@gmail.com on 11 Jun 2014 at 8:29

GoogleCodeExporter commented 9 years ago
Brilliant idea. Thank you.

Original comment by james.be...@gmail.com on 16 Jun 2014 at 8:47