justincampbell / generative

Generative/property-based testing for RSpec
MIT License

Random data generation #8

Open justincampbell opened 10 years ago

justincampbell commented 10 years ago

Using this issue to compile thoughts on this

To satisfy these last two, I think I like the idea of having a generate method, which takes a symbol and some other options.

I've been thinking about more functionally applicative ways to accomplish data generation. The best example I've found is the Rantly library:

Rantly { array(10) { sized(5) { string } } }

This is powerful, but I don't find it very intuitive for onboarding people to this process, and I dislike the possibility of name clashes and the uncertainty around that. I'm going to pursue a similar style, but with a single generate method:

generate(:array, size: 10) { generate :string, size: 5 }
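
A minimal sketch of how a single generate entry point could dispatch on a symbol, assuming only :integer, :string and :array generators. The DataSketch module, its ALPHABET constant and the meaning given to size: are hypothetical names chosen for illustration, not the API of Generative or of any published gem.

```ruby
# Hypothetical sketch: one `generate` method that dispatches on a symbol.
# DataSketch and ALPHABET are illustrative names, not a real library API.
module DataSketch
  ALPHABET = ("a".."z").to_a.freeze

  module_function

  def generate(type, size: 10)
    case type
    when :integer
      rand(size + 1) # an integer between 0 and size, inclusive
    when :string
      Array.new(size) { ALPHABET.sample }.join
    when :array
      # The block supplies each element; fall back to integers without one.
      Array.new(size) { block_given? ? yield : generate(:integer) }
    else
      raise ArgumentError, "unknown generator: #{type.inspect}"
    end
  end
end

# Ten random strings of five lowercase letters each.
words = DataSketch.generate(:array, size: 10) { DataSketch.generate(:string, size: 5) }
puts words.inspect
```

Because everything goes through one method name, there is only one identifier that could clash with the code under test.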

This will be a separate library from Generative, and Generative will require it, but I'm happy to include in our documentation how to include and use other libraries such as Rantly.

/cc @dmcclory @jessitron #3

nessamurmur commented 10 years ago

@justincampbell I really like that you didn't try to tackle generating random data first in generative... I could see someone working on your typical Ruby web app getting really far just using this + something like random_data + FactoryGirl.

I really like these two points you made:

  • Generative should give the user data generation capability out of the box
  • Generative should require another gem to accomplish this, as data generation and generative testing are not dependent upon one-another

What do you think about generative never providing anything at all, but having a gem that provides:

  1. Generation of primitives
  2. Registration for new generators

i.e. if someone did want to use @dmcclory's idea from #3 almost a year ago of using a type mapper of some kind, they could. If someone wanted to actually rely on FactoryGirl, they could. Registration would just let them tie their generator (which would just be something that responds to #call) to a specific symbol for the generate method.
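
A minimal, runnable sketch of that registration idea, assuming a generator is anything that responds to #call. GeneratorRegistry, CoinFlip and the :digit, :pair and :coin names are all hypothetical; this is not the actual API of Generative or Degenerate.

```ruby
# Hypothetical registry: maps a symbol to any object that responds to #call.
module GeneratorRegistry
  @generators = {}

  class << self
    def register(name, callable)
      unless callable.respond_to?(:call)
        raise ArgumentError, "generator must respond to #call"
      end
      @generators[name] = callable
    end

    def generate(name)
      generator = @generators.fetch(name) do
        raise ArgumentError, "no generator registered for #{name.inspect}"
      end
      generator.call
    end
  end
end

# A lambda works as a generator...
GeneratorRegistry.register(:digit, -> { rand(10) })

# ...generators can build on other registered generators...
GeneratorRegistry.register(:pair, lambda {
  [GeneratorRegistry.generate(:digit), GeneratorRegistry.generate(:digit)]
})

# ...and so does any class or object with a `call` method.
class CoinFlip
  def self.call
    [:heads, :tails].sample
  end
end
GeneratorRegistry.register(:coin, CoinFlip)

puts GeneratorRegistry.generate(:pair).inspect
puts GeneratorRegistry.generate(:coin)
```

Storing the callable rather than a value means a fresh value is produced on every generate call.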

This may be the direction you were already thinking, not sure... Here's some random half-baked examples of some different styles people might try.

Generative::Generator.register(:full_name, -> { "#{generate(:string)} #{generate(:string)}" })
generate(:full_name) # => "joiemdicahdlmc omja"

class Foo < TypeMapping

  # where Bar is a class
  generator_for Bar do
    initializer Float, Fixnum

    accessor :baz, Float
    accessor :quz, NotherClass
  end
end

Generative::Generator.register(:bar, -> { Bar.generate })

class LameGenerator
  # extend (rather than include) so random_string is callable from self.call
  extend Generative::Primatives
  def self.call
    random_string
  end
end

Generative::Generator.register(:lame, LameGenerator)

# in some generic Rails app
# wrapped in a lambda so a new user is built on every generate(:user) call
Generative::Generator.register(:user, -> { FactoryGirl.build(:user, id: generate(:integer)) })

nessamurmur commented 10 years ago

Turning this into a checklist:

nessamurmur commented 10 years ago

Started on a gem to generate random data: https://github.com/levionessa/degenerate

arronmabrey commented 9 years ago

One thing to consider is test determinism and repeatability.

As I'm sure you're aware, RSpec ships with a feature that randomizes the test execution order, to expose hidden interdependencies within a test suite.

While this is clearly not the same problem Generative aims to solve, I do think everyone agrees that the basic concept of adding randomness to our testing to expose hidden defects is a good one.

The one thing I want to point out, and I hope everyone takes note of, is this RSpec feature would be very difficult to use if it weren't for the ability to specify a --seed value to achieve test determinism and repeatability.

This is crucial when a real bug is exposed through randomness introduced into the test, as it allows you to get reproducible errors by running the exact same tests with the same values.

Here is a link to the RSpec docs on the subject: "randomization-can-be-reproduced-across-test-runs"

nessamurmur commented 9 years ago

@arronmabrey thanks for the feedback. I'm definitely not opposed to this... I would say, though, that in other generative testing libraries in other languages I don't have that, and it doesn't really matter much. Are you familiar with shrinking in generative testing?

Usually what I do when I hit a failure is take the shrunk example and write an example-based test using it, so I can work with it for a while. If I wasn't sure why the example was failing when I started, I usually figure it out and keep this test around to make sure there aren't any regressions that re-introduce that edge case.

I don't see this as a reason not to ensure that you can get repeatable generated examples... but I do see shrinking as a priority over that atm... Haven't quite gotten around to it. I'd happily take PRs that work towards either of these two things.
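
For readers unfamiliar with shrinking, here is a minimal sketch of the idea for non-negative integers: once a failing value is found, keep trying smaller candidates and keep any that still fail. The shrink_candidates and shrink helpers and the property_fails lambda are hypothetical; this is a generic greedy strategy, not the implementation used by Rantly, Degenerate or Generative.

```ruby
# Smaller candidates for a non-negative integer: zero, half, and one less.
def shrink_candidates(n)
  [0, n / 2, n - 1].uniq.select { |c| c >= 0 && c < n }
end

# Greedy shrink: move to the first smaller candidate that still fails,
# and stop when no smaller candidate fails.
def shrink(value, &fails)
  loop do
    smaller = shrink_candidates(value).find { |c| fails.call(c) }
    return value unless smaller

    value = smaller
  end
end

# Stand-in for "the property under test fails for this input".
property_fails = ->(n) { n >= 17 }

puts shrink(1000, &property_fails) # prints 17
```

The shrunk value (17 here) is the kind of input that would then be copied into a plain example-based regression test.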

arronmabrey commented 9 years ago

Actually this Just Works.

This is because when you use the command-line option rspec --seed 1234 there is a line of code Kernel.srand config.seed, in the default spec_helper.rb file.

This causes Kernel.srand to be called with 1234, and everything thereafter that uses Ruby's pseudorandom number generator, Random::DEFAULT, will get deterministic results.

When I wrote my first comment, I was not getting this behavior and assumed this was the library's fault (sorry).

Turns out there is a bug in RSpec 3.1.x whereby, if you have the line --require spec_helper in your .rspec file, Kernel.srand config.seed is evaluated before the --seed 1234 option passed on the command line is considered.
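
A small sketch of the effect outside RSpec, assuming the generated data comes from Ruby's default generator (Kernel#rand, Array#sample and so on). The sample_values helper is a hypothetical stand-in for a data generator.

```ruby
# Stand-in for a generator that draws from Ruby's default PRNG.
def sample_values
  Array.new(5) { rand(1_000) }
end

Kernel.srand(1234)
first_run = sample_values

# Re-seeding with the same value replays the same sequence.
Kernel.srand(1234)
second_run = sample_values

puts first_run == second_run # prints "true"
```

A generator that creates its own Random instance without passing the seed through would not be covered by this.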

This bug looks to be fixed in the next release of RSpec, I'm guessing v3.2.0. My short-term fix was to remove --require spec_helper from my .rspec file.

nessamurmur commented 9 years ago

Cool. Haven't been watching for it in Degenerate so let me know if you notice any random data not working out this way.

nessamurmur commented 9 years ago

Just want to track some thoughts...

Getting ready soon to implement something for shrinking and might take a stab at more composable generators...

I noticed Rantly is doing roughly what I was thinking of for shrinking: https://github.com/hayeah/rantly/blob/master/lib/rantly/shrinks.rb

nessamurmur commented 9 years ago

More notes... seems like the part that's like Rantly is straightforward enough... the trickier part for me is how to hijack the running of an example and rescue from a failure by shrinking and then failing with the shrunk data... Overriding https://github.com/rspec/rspec-core/blob/master/lib/rspec/core/example.rb#L189-L237 seems like a bad idea... not coming up with another path yet...

justincampbell commented 9 years ago

@levionessa yeah, I also couldn't find a way to access the let/data from the failure formatter. We could change data to store a reference somewhere, but I guess that's not much use if we can't re-run a spec after shrinking.

nessamurmur commented 9 years ago

@justincampbell some half-baked ideas I haven't really explored yet:

  1. Create a subclass of RSpec::Core::Example. Something like GenerativeExample and write something like an ExampleGroup.check method that would replace it... Not partial to this idea... Haven't explored it very far but thinking it'd require too much knowledge of RSpec from Generative...
  2. Follow Rantly's example and instead ignore example groups and example altogether, using some check method inside the block that gets passed to it... Haven't fully thought out the consequences but beyond being a definite breaking change it will probably make things like reporting up to RSpec challenging...

nessamurmur commented 9 years ago

  1. Something else involving using a custom method instead of an example group method...
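
A rough sketch of what a standalone check method along those lines might look like, with no RSpec involved: run the property against generated input, rescue a failure, shrink the input, then fail again with the shrunk data. ShrunkFailure, smaller_arrays, fails? and check are all hypothetical names, the shrinking is a naive "drop one element" strategy for arrays, and an RSpec integration would need to rescue RSpec's own expectation failure class rather than StandardError.

```ruby
class ShrunkFailure < StandardError; end

# Smaller candidates for an array: every copy with one element removed.
def smaller_arrays(array)
  array.each_index.map { |i| array[0...i] + array[(i + 1)..-1] }
end

# True when the property block raises for this input.
def fails?(input)
  yield input
  false
rescue StandardError
  true
end

# Runs the property against generated inputs. On the first failure,
# greedily shrinks the input and raises with the smallest failing case.
def check(generator, runs: 100, &property)
  runs.times do
    input = generator.call
    next unless fails?(input, &property)

    while (smaller = smaller_arrays(input).find { |c| fails?(c, &property) })
      input = smaller
    end
    raise ShrunkFailure, "property failed for #{input.inspect}"
  end
  true
end

generator = -> { Array.new(rand(1..20)) { rand(100) } }

begin
  check(generator) { |xs| raise "too big" if xs.any? { |x| x >= 50 } }
rescue ShrunkFailure => e
  puts e.message # reports a shrunk failing array
end
```

Raising a single error that carries the shrunk input is what would eventually have to be reported back to RSpec.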