stripe / rainier

Bayesian inference in Scala.
https://rainier.fit
Apache License 2.0

Model.sample #409

Closed avibryant closed 4 years ago

avibryant commented 4 years ago

This closes the loop by giving Prediction a toSample method that does the actual sampling (but still allows modifying the generating function after sampling, for quick exploration).

This is the minimal change needed to actually run new-style models. Once this works and has landed, I'll go back and revisit the APIs around Sampler, TargetGroup, etc.
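
Roughly, the shape that describes (a sketch with stand-in types for illustration, not the code in this diff):

```scala
// Sketch only: every body here is a stand-in, not the real implementation.
trait Model
trait Generator[T] { def map[U](fn: T => U): Generator[U] }

// A Sample keeps the posterior draws together with the generator, so the
// generating function can still be remapped after sampling, i.e. quick
// exploration without re-running the sampler.
trait Sample[T] { def map[U](fn: T => U): Sample[U] }

case class Prediction[T](model: Model, generator: Generator[T]) {
  // toSample is what actually runs the sampler and closes the loop.
  def toSample: Sample[T] = ???
}
```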

codecov-io commented 4 years ago

Codecov Report

Merging #409 into nuke-rv will decrease coverage by 0.16%. The diff coverage is 16.66%.

@@             Coverage Diff             @@
##           nuke-rv     #409      +/-   ##
===========================================
- Coverage    33.05%   32.88%   -0.17%     
===========================================
  Files           66       66              
  Lines         2538     2554      +16     
  Branches       144      133      -11     
===========================================
+ Hits           839      840       +1     
- Misses        1699     1714      +15
Impacted Files                                           Coverage Δ
...src/main/scala/com/stripe/rainier/core/Model.scala    0% <0%> (ø)
...main/scala/com/stripe/rainier/compute/Target.scala    0% <0%> (ø)
...rc/main/scala/com/stripe/rainier/core/Sample.scala    0% <0%> (ø)
...ain/scala/com/stripe/rainier/compute/RealOps.scala    77.5% <66.66%> (-0.47%)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 99cb765...04fdf0c.

avibryant commented 4 years ago

Yes, you're asking the right question, and I've gone down the same train of thought myself. I go back and forth on this. My argument in favor of the current approach is:

Another point in the design space would be to get rid of Prediction and just have sample(g: Generator[T]): Sample[T] on Model; if you want to pass around a not-yet-sampled Model + Generator, you just do that as a tuple of the two.
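
For concreteness, that alternative might look something like this sketch; only the sample(g: Generator[T]): Sample[T] signature comes from the sentence above, the rest is stand-in:

```scala
// Sketch only: an alternative shape with no Prediction at all.
trait Generator[T]
trait Sample[T]

trait Model {
  // Sampling takes the generator directly.
  def sample[T](g: Generator[T]): Sample[T] = ???
}

object NotYetSampled {
  // A not-yet-sampled "prediction" is just a pair you pass around yourself:
  def sampleLater[T](pending: (Model, Generator[T])): Sample[T] =
    pending._1.sample(pending._2)
}
```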

Thoughts?

andrew-stripe commented 4 years ago

I think I like the idea of starting with:

get rid of Prediction and just have sample(g: Generator[T]): Sample[T] on Model

At worst, Prediction comes back later and this ends up as sugar for the very common case of model.predict(g).toSample
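
That sugar reading might look roughly like this (hypothetical; predict and the stand-in types below are assumptions, not anything in the diff):

```scala
// Hypothetical sketch with stand-in types; nothing here is in the diff.
trait Generator[T]
trait Sample[T]

case class Prediction[T](model: Model, generator: Generator[T]) {
  def toSample: Sample[T] = ???
}

trait Model {
  def predict[T](g: Generator[T]): Prediction[T] = Prediction(this, g)
  // sample(g) as sugar for the very common case:
  def sample[T](g: Generator[T]): Sample[T] = predict(g).toSample
}
```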

avibryant commented 4 years ago

@andrew-stripe so this does something even more minimal: Model has sample(), which takes only sampler parameters (no generator) and returns a Sample; you can then call predict(g) on the Sample to get a list of values.
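
A rough sketch of that more minimal shape, with placeholder sampler parameters rather than the real signature:

```scala
// Sketch only: sample() takes sampler configuration, no generator...
trait Generator[T]

trait Sample {
  // ...and predict(g) maps the stored posterior draws through a generator
  // to produce concrete values.
  def predict[T](g: Generator[T]): List[T] = ???
}

trait Model {
  // Placeholder parameters; the real sampler configuration will differ.
  def sample(chains: Int = 4, iterations: Int = 1000): Sample = ???
}
```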