While working on #14, I found another Python 2.7.5 bug. 40 tests fail while a Unit's template is trying to compile a regular expression, with the error:
error: nothing to compile
Here's an example of a regex that leads to this error:
(?P<parts>(?:[A-Za-z]+|[^A-Za-z0-9]*)+)
From what I understand, the problem is that the expression in the innermost parentheses can match the empty string (its second alternative, [^A-Za-z0-9]*, accepts zero characters), yet it's followed by a +, which repeats it one or more times. I think Python versions 2.7.6 and later allow this, while 2.7.5 doesn't. (These tests only fail on 2.7.5.)
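A minimal way to check this on a given interpreter, as a sketch: the helper name try_compile is made up for illustration and isn't part of the library. It only tries to compile the example regex and reports the error message if compilation fails.

```python
import re

# The example regex from this report.
PATTERN = r"(?P<parts>(?:[A-Za-z]+|[^A-Za-z0-9]*)+)"


def try_compile(pattern):
    """Hypothetical helper: return (True, None) if the pattern compiles,
    otherwise (False, error message)."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return False, str(exc)
    return True, None


ok, message = try_compile(PATTERN)
print(ok, message)
```

On interpreters that accept the pattern, this should print True None; on an affected one, I'd expect it to print False followed by the error text.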
All failing tests are for Units AlphaSymbol, NumericSymbol, and AlphaNumericSymbol, all CompoundUnit types that use the simple Formatting unit as a component. I've traced the underlying problem to the fact that the Formatting unit allows 0 or more matches. In most call numbers formatting is treated as optional, so originally this seemed desirable. But, in retrospect, having it match nothing by default seems counterintuitive, and also unnecessary, since you can still set formatting components as optional on an individual basis. Changing the default min_length from 0 to 1 does solve the nothing to compile error, but it breaks some of the tests that cover the current behavior. Fixing it will be a matter of untangling that web and making sure none of the more complex callnumber types rely on that behavior to function.
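To illustrate why the min_length change helps, here is a rough sketch. The functions formatting_pattern, compound_inner, and can_match_empty are hypothetical stand-ins, not the actual Unit or template code; they only mimic how a minimum length could end up as a regex quantifier.

```python
import re


def formatting_pattern(min_length=0):
    """Hypothetical stand-in for the Formatting unit's regex:
    non-alphanumeric characters, repeated at least min_length times."""
    return r"[^A-Za-z0-9]{%d,}" % min_length


def compound_inner(min_length):
    """Hypothetical inner group of a compound unit: letters or formatting."""
    return r"(?:[A-Za-z]+|%s)" % formatting_pattern(min_length)


def can_match_empty(pattern):
    """Return True if the pattern can match the empty string."""
    return re.match(r"(?:%s)\Z" % pattern, "") is not None


# With min_length=0 the inner group can match nothing at all,
# which is the situation that gets repeated by the outer +.
empty_with_zero = can_match_empty(compound_inner(0))

# With min_length=1 every alternative consumes at least one character.
empty_with_one = can_match_empty(compound_inner(1))

print(empty_with_zero, empty_with_one)
```

Under these assumptions, only the min_length=0 version lets the repeated group match the empty string, which matches what I see when changing the default.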