wellner / jcarafe

BSD 3-Clause "New" or "Revised" License

attributeFn encoded features could conflict #12

Open antonyscerri opened 11 years ago

antonyscerri commented 11 years ago

Looking at the code for attributeFn, it appears that it can introduce conflicting features because it simply concatenates the attribute name and value with no intermediate character or string. There also seems to be a more general issue where other feature functions could produce collisions: for example, wordFn and caselessWdFn could collide with each other, and either could collide with attributeFn.
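
To illustrate the concern, here is a minimal sketch in Python rather than the project's own language. It assumes the encoding is plain concatenation as described above; `encode_attribute` and `encode_word` are hypothetical stand-ins, not the actual jcarafe functions.

```python
def encode_attribute(name: str, value: str) -> str:
    # Assumed behaviour: name and value joined with no separator.
    return name + value


def encode_word(word: str) -> str:
    # Assumed behaviour: the token itself is used as the feature string.
    return word


# Two different attribute/value pairs yield the same feature string.
a = encode_attribute("pos", "tag")
b = encode_attribute("post", "ag")

# A plain word feature can also coincide with an attribute feature.
c = encode_word("postag")

print(a == b, a == c)
```

If the real functions behave this way, distinct inputs would share one feature and therefore one weight in the model.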

antonyscerri commented 11 years ago

Should this be an issue, backward compatibility could be preserved by giving each function a stricter encoding method than the default, one that introduces a unique prefix/suffix per function and/or includes additional boundary separators between the constituents, as in the case of attributeFn.

This is only going to be necessary if collisions of this form do occur frequently enough to cause any problems.
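
A rough sketch of what such a stricter encoding might look like, again in Python for brevity. The prefixes, the separator, and the function names are all hypothetical choices for illustration, not anything defined in jcarafe.

```python
# Hypothetical separator, assumed never to occur in names, values, or tokens.
SEP = "\u0001"


def strict_attribute(name: str, value: str) -> str:
    # Function-specific prefix plus a boundary between name and value.
    return "attr" + SEP + name + SEP + value


def strict_word(word: str) -> str:
    # A distinct prefix keeps word features apart from attribute features.
    return "wd" + SEP + word


def strict_caseless_word(word: str) -> str:
    # Lower-cased token under its own prefix, so it cannot match strict_word.
    return "cwd" + SEP + word.lower()


# The pairs that collide under plain concatenation are now distinct.
print(strict_attribute("pos", "tag") != strict_attribute("post", "ag"))
print(strict_word("the") != strict_caseless_word("the"))
```

The guarantee only holds while the separator cannot appear in the inputs; otherwise the constituents would need escaping as well.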