Closed zackw closed 1 year ago
Thanks for the PR 😅 I'll have a look at this more closely when I have time, but upon first inspection it looks sensible.
If I haven't reviewed by the end of this weekend feel free to ping me 😅
Agh thanks for reminding me I need to look at this -- will do it this evening after work
Rebased on latest trunk, sorry for the delay.
Added another commit to fix up the code for an API change. I cannot run the test suite locally right now due to what looks like a bug in message-io...
error[E0716]: temporary value dropped while borrowed
--> .../.cargo/registry/src/index.crates.io-6f17d22bba15001f/message-io-0.17.0/src/adapters/udp.rs:241:22
|
241 | &mut [io::IoSliceMut::new(&mut input_buffer)],
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ creates a temporary value which is freed while still in use
...
244 | );
| - temporary value is freed at the end of this statement
245 |
246 | match result {
| ------ borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
... but this is sufficient to make proptest build again as a dependency of a larger project that uses the feature this PR adds, and that project's testsuite passes.
The additional commit should be squashed before merging.
Fatfingered the close button, sorry for the noise.
It is desirable to be able to generate, from a regex, byte sequences that are not necessarily valid UTF-8. For example, suppose you have a parser that accepts any string generated by the regex
Then, in your test suite, you might want to generate strings from the complementary regular language, which you could do with the regex
However, this will still only generate valid UTF-8 strings. Maybe you are parsing directly from byte sequences read from disk, in which case you want to test the parser’s ability to reject invalid UTF-8 as well as valid UTF-8 but not within the accepted language. Then you want this slight variation:
But this regex will be rejected by
bytes_regex
, because by defaultregex_syntax::Parser
errors out on any regex that potentially matches invalid UTF-8. The application — i.e. proptest — must opt into use of such regexes. This patch makes proptest do just that, forbytes_regex
only.There should be no change to the behavior of any existing test suite, because opting to allow use of
(?-u)
does not change the semantics of any regex that doesn’t contain(?-u)
, and any existing regex that does contain(?-u)
must be incapable of generating invalid UTF-8 for other reasons, orregex_syntax::Parser
would be rejecting it. (For example,(?-u:[a-z])
cannot generate invalid UTF-8.)This patch also adds a bunch of tests for
bytes_regex
, which AFAICT was not being tested at all. Some of these use the new functionality and others don’t. There is quite a bit of code duplication in the test helper functions —do_test
anddo_test_bytes
are almost identical, as aregenerate_values_matching_regex
andgenerate_byte_values_matching_regex
. I am not good enough at generic metaprogramming in Rust to factor out the duplication.