When using a condition that matches a string with no trailing whitespace, such as `match(/\S(.*\S)*/)` or `match(/\S([^\n]*\S)*/)`, the process hangs if the `primary_dt_condition_func`, which uses `re.fullmatch`, is run over a string that has trailing whitespace (and therefore does not match). The hang occurs here: https://github.com/jamesaoverton/cmi-pb-terminology/blob/next/src/script/validate.py#L357
I believe this is caused by catastrophic backtracking in the regex rather than by a bug in the validation code, but it means I am unable to load datasets that contain non-matching values.
In the meantime, I can use `exclude(/^\s+|\s+$/)` for my use case, but it would be good to implement a workaround, as this may happen with other patterns.
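For reference, here is a minimal standalone sketch of what I think is going on, outside of `validate.py`. The names `NESTED`, `FLAT`, and `is_trimmed` are hypothetical and only for illustration. It assumes the nested quantifier `(.*\S)*` is the culprit and that an optional group, `\S(?:.*\S)?`, accepts the same strings under `re.fullmatch`. I have not checked whether the `match(...)` condition syntax accepts a non-capturing group.

```python
import re

# Pattern from this report. The repeated group (.*\S)* lets the backtracking
# engine split a run of non-whitespace characters in many different ways
# before it can conclude that the full match fails.
NESTED = re.compile(r"\S(.*\S)*")

# Hypothetical rewrite: the group is optional rather than repeated, so a
# failing input is rejected after a single backward scan.
FLAT = re.compile(r"\S(?:.*\S)?")


def is_trimmed(value: str) -> bool:
    """Return True if value fully matches FLAT, i.e. it starts and ends
    with a non-whitespace character and contains no newline."""
    return FLAT.fullmatch(value) is not None


# On short inputs both patterns give the same answer.
for sample in ["a", "a b", "a b ", " a", "", "a\tb c"]:
    assert (NESTED.fullmatch(sample) is not None) == is_trimmed(sample)

# A long value with trailing whitespace is the case that hangs with NESTED;
# it is deliberately only run against FLAT here.
long_bad = "a" * 10_000 + " "
print(is_trimmed(long_bad))  # False
```

Running `NESTED.fullmatch(long_bad)` is left out on purpose, since that is the call that does not return in practice.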