gfredericks / test.chuck

A utility library for test.check
Eclipse Public License 1.0

added charsets for [vVhHR] #3

Closed: miner closed this 9 years ago

miner commented 9 years ago

I was getting some errors when running the tests, so I added a few additional charsets as defined in the Javadoc: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

These additions eliminated the errors for me. I'm not entirely sure about all the Unicode chars, but I think they're correct.
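For anyone wanting to double-check the charsets, here is a rough sketch that compares the Java 8 shorthands `\h`, `\H`, `\v` and `\V` against the explicit classes listed in the Javadoc linked above. The class name `WhitespaceClasses` and the helper `sameOnBmp` are made up for illustration and are not part of test.chuck; the check only covers single characters in the Basic Multilingual Plane and assumes a Java 8 or later runtime.

```java
import java.util.regex.Pattern;

public class WhitespaceClasses {
    // Explicit expansions as written in the Java 8 Pattern Javadoc.
    static final String HORIZONTAL =
        "[ \\t\\xA0\\u1680\\u180e\\u2000-\\u200a\\u202f\\u205f\\u3000]";
    static final String VERTICAL =
        "[\\n\\x0B\\f\\r\\x85\\u2028\\u2029]";

    // Turns "[...]" into "[^...]" to get the complemented class.
    static String negate(String charClass) {
        return "[^" + charClass.substring(1);
    }

    // True if both regexes agree on every one-char string in the BMP.
    static boolean sameOnBmp(String shorthand, String expansion) {
        Pattern a = Pattern.compile(shorthand);
        Pattern b = Pattern.compile(expansion);
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            String s = String.valueOf((char) cp);
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("\\h: " + sameOnBmp("\\h", HORIZONTAL));
        System.out.println("\\H: " + sameOnBmp("\\H", negate(HORIZONTAL)));
        System.out.println("\\v: " + sameOnBmp("\\v", VERTICAL));
        System.out.println("\\V: " + sameOnBmp("\\V", negate(VERTICAL)));
    }
}
```

A brute-force comparison like this does not say anything about how the generator should treat these escapes on Java 7, where they are not documented the same way.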

gfredericks commented 9 years ago

Oh hey it's a Java 8 thing.

I've been meaning to investigate the 7 vs 8 differences but hadn't gotten to that yet.

Not sure what the best approach for supporting both is. If Java 8 regexes are just a superset of Java 7 then it might be workable to just target 8 (I'd just have to figure out how to start testing on 8).

Something that's been a bit awkward so far is my decision to try to accurately reproduce the exception-throwing behavior of re-pattern (this test will fail otherwise), and I think that gets more difficult when trying to get tests to pass on Java 7 and 8 simultaneously.

One approach is to have less logic in the parser.

gfredericks commented 9 years ago

Thanks for working on this by the way.

miner commented 9 years ago

I am using Java 8. I didn't think to look at any possible differences between Java 7 and 8.

miner commented 9 years ago

I was kind of guessing about how \R was supposed to work. It was just one of those things that popped up as a failure case when I ran the tests. Now that I think about it, maybe you should just mark those charsets as unsupported if they don't work for both Java 7 and 8. My main motivation was just to make sure the tests would pass. :-)

gfredericks commented 9 years ago

I hadn't checked this till now, but according to the recent survey Java 7 is still the most used.

Once I figure out the easiest way to start testing with Java 8 in parallel, I can try to figure out what other differences there are.

gfredericks commented 9 years ago

Oh, I just found \R in the Java 8 docs and it describes it about the same way I did. I still don't know whether #"[\R]" is valid.

gfredericks commented 9 years ago

Finally got a Java 8 REPL, and indeed #"[\R]" is illegal, so I think special-casing \R makes sense (i.e., it's not a character class at all; it's shorthand for \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]).
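The same check can be repeated outside a REPL. The following is a minimal sketch, assuming a Java 8 or later runtime; the class `LinebreakShorthand` and its helpers `compiles` and `agreeOn` are hypothetical names for illustration, not anything in test.chuck. It confirms that `\R` compiles on its own, that `[\R]` is rejected, and that `\R` and the expansion quoted above agree on whole-string matches for a few sample inputs.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class LinebreakShorthand {
    // The expansion of \R quoted in the comment above.
    static final String EXPANSION =
        "\\u000D\\u000A|[\\u000A\\u000B\\u000C\\u000D\\u0085\\u2028\\u2029]";

    // True if Pattern.compile accepts the regex, false if it throws.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // True if \R and the expansion give the same whole-string result.
    static boolean agreeOn(String input) {
        return Pattern.matches("\\R", input)
            == Pattern.matches(EXPANSION, input);
    }

    public static void main(String[] args) {
        System.out.println("\\R compiles:   " + compiles("\\R"));
        System.out.println("[\\R] compiles: " + compiles("[\\R]"));
        String[] samples = {"\r\n", "\n", "\r", "\u0085", "\u2028", "a", "\n\r", ""};
        for (String s : samples) {
            System.out.println("agree on sample: " + agreeOn(s));
        }
    }
}
```

The samples are deliberately limited to whole-string matches; how `\R` interacts with backtracking when it is followed by more pattern is a separate question that this sketch does not cover.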

gfredericks commented 9 years ago

I've started working on a branch (called java-8) to support both the Java 7 and Java 8 feature sets in parallel. Actually it's done as far as I know; I just want to run the tests for a day or so to make sure they don't find any other differences.

Let me know if you see any problems with it. As far as I know it should solve your issues.

gfredericks commented 9 years ago

No test failures on my end, so I'm going to go ahead and release the java-8 branch.

gfredericks commented 9 years ago

Should be fixed in 0.1.9.