tom-lord / regexp-examples

Generate strings that match a given regular expression
MIT License

Randomness regression? #21

Closed: cabo closed this issue 6 years ago

cabo commented 6 years ago

The CDDL tool uses regexp-examples as part of its instance-example generation.

A typical regex might turn up in the CDDL expression

nai = tstr .regexp "\\w+@\\w+(\\.\\w+)+"

A while ago, the CDDL tool generated

"N1@CH57HF.4Znqe0.dYJRN.igjf"

from that. Now I get instances such as

"3l9FP@dHYj37.bcdac.a.a.a.a"
"1caU3X@zJ1.aeeeba"
"UZY@AZU.beecea.a.a"
"jDQFpQ@6iW.ccbbcc.a"
"kdP7X@9jPaW.ccdbc.a"
"Q@4.aabeb.a.a.a.a"
"Hc@8Ts2t.aaccc"
"wK9S@5yMKl.ccaeea.a"
"U1W4WN@wDzHUH.beedca.a.a.a.a.a"
"0Fd8Mn@3FuVVy.adbeda.a.a"

Obviously, this is less satisfying as a set of examples.

Any reason why the entropy vanishes at the end of the RE?

(The same is true with

nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+"

except that in this case the upper case wins:

"p1zG@na.CDABD"
"f9OHe@3kTD4.CCEEDC.A.A"
"hOeIxN@v5h.DAAAB.A.A.A"
"t7q@oEdreG.CCBAC.A"
"to6@HYu.ABADB.A.A.A.A.A"
"B1Qv@ujnEqZ.EBBEA.A.A.A"
"Uf@l4kv.CEEBBC.A.A.A"
"js@6P.BDDE.A.A.A.A"
"r5Ot1K@9c.ADCECB"
"t@GA4.BBCDDB.A.A"

)
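To put a number on the loss of variety reported above, here is a minimal Ruby sketch that tallies the distinct characters in each part of the ten samples from the first regexp. It only uses the strings quoted in the report and does not call the gem; the helper name `distinct_chars_by_part` is made up for illustration.

```ruby
# Samples copied verbatim from the report above (first regexp).
SAMPLES = %w[
  3l9FP@dHYj37.bcdac.a.a.a.a
  1caU3X@zJ1.aeeeba
  UZY@AZU.beecea.a.a
  jDQFpQ@6iW.ccbbcc.a
  kdP7X@9jPaW.ccdbc.a
  Q@4.aabeb.a.a.a.a
  Hc@8Ts2t.aaccc
  wK9S@5yMKl.ccaeea.a
  U1W4WN@wDzHUH.beedca.a.a.a.a.a
  0Fd8Mn@3FuVVy.adbeda.a.a
].freeze

# Hypothetical helper: split each sample into the three parts of
# \w+@\w+(\.\w+)+ and collect the distinct characters seen in each part.
def distinct_chars_by_part(samples)
  parts = { local: [], host: [], labels: [] }
  samples.each do |sample|
    local, rest = sample.split('@', 2)   # text before and after the "@"
    host, *labels = rest.split('.')      # first \w+ after "@", then the repeated group
    parts[:local].concat(local.chars)
    parts[:host].concat(host.chars)
    parts[:labels].concat(labels.join.chars)
  end
  parts.transform_values { |chars| chars.uniq.sort }
end

distinct_chars_by_part(SAMPLES).each do |part, chars|
  puts "#{part}: #{chars.length} distinct (#{chars.join})"
end
```

On these ten samples the dotted labels draw on only the five characters a to e, while the two parts before the first dot use a much wider range.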

tom-lord commented 6 years ago

Interesting, thanks for the report... I'll look into this ASAP. At a guess, I suspect it's got something to do with the max_results feature that was added in v1.2.0 of the gem.

Writing tests to prevent such a regression is tricky (I fixed a more subtle issue back in v1.1.3), but I'll have a think about how to improve the suite.
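One way such a regression might be caught is a coarse statistical check: draw many samples with a fixed seed and require a minimum number of distinct characters. The sketch below is only an illustration under stated assumptions. `HEALTHY` and `SKEWED` are hypothetical stand-in generators, not the gem's code, and `diverse_enough?` and its thresholds are invented for this example; a real test would call the gem's generator instead.

```ruby
# Characters matched by \w in ASCII mode.
WORD_CHARS = (('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a + ['_']).freeze

# Hypothetical stand-ins for "a generator that produces one \w+ label".
# HEALTHY picks from all word characters; SKEWED mimics the reported
# symptom by picking only from the first five.
HEALTHY = ->(rng) { Array.new(5) { WORD_CHARS[rng.rand(WORD_CHARS.size)] }.join }
SKEWED  = ->(rng) { Array.new(5) { WORD_CHARS[rng.rand(5)] }.join }

# Returns true when `runs` samples from the generator contain at least
# `min_distinct` different characters. The seed keeps the check repeatable.
def diverse_enough?(generator, min_distinct:, runs: 200, seed: 42)
  rng = Random.new(seed)
  seen = Array.new(runs) { generator.call(rng) }.join.chars.uniq
  seen.size >= min_distinct
end

puts diverse_enough?(HEALTHY, min_distinct: 30)
puts diverse_enough?(SKEWED, min_distinct: 30)
```

A check like this cannot prove the output is uniformly distributed; it only flags the gross case where part of the pattern collapses onto a handful of characters.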

tom-lord commented 6 years ago

Sorry it took me absolutely ages to get round to fixing this... The project's not dead; I'm just a busy guy 😅

Gem version v1.4.3 is now released with the fix; thanks again for the report 😄