Closed: bbbales2 closed this issue 5 years ago.
I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this?
I'm not sure why test all combinations of inputs for prim/rev is called out separately.
It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`).
I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework.
The trick is going to be to build up functionality in small, understandable PRs.
On Mon, Aug 20, 2018 at 5:48 AM Bob Carpenter notifications@github.com wrote:
I'm OK with the proposed directional derivative tests instead of exhaustive input and output tests. @syclik --- do you have an opinion on this?
I'm not sure what the alternative (exhaustive input and output tests) is.
For simplicity, I think the overall description / design is good.
I think the implementation should do a little more than just check the directional derivatives. Here's what I'm thinking: for anything vectorized, this sort of test gives a false sense of correctness. There are two cases I think we should cover (because we've introduced bugs this way before, and I think we'll continue to be at risk because it's tricky): one where we've vectorized with the same var used multiple times, and the other where all the elements are independent vars.
We won't have this sort of testing requirement when using autodiffed operations. We get into this problem when we start specializing derivatives. There isn't really a lower spot to test it at.
So... I know it adds complexity, but maybe a flag to just test the directional derivatives when it's a simply written function, plus something that includes a simple construction of the vectorized version under both conditions? Not looking for that to be too crazy or full coverage. As long as the value of the derivative isn't trivial (maybe 0), it should be pretty easy to tell if this mistake was made.
I think with that, the tests become something that we can trust. If not, I think the person writing the function should add more tests to demonstrate that the vectorized code is correct.
Btw, I don't mean we have to test every derivative when it's vectorized.
Oh... and happy to discuss; perhaps there's a different way to prevent that sort of behavior.
I'm not sure why test all combinations of inputs for prim/rev is called out separately.
It should be possible to use or borrow from the general testing framework (the one used for `operator_multiplication_test`).
I'm OK dropping the vectorization tests there now and relying on a few explicit instantiations for each in the new framework.
The trick is going to be to build up functionality in small, understandable PRs.
Agreed. Perhaps the first would be to have the simplest test with the interface defined. Then other PRs could be for adding more tests incrementally?
I should have asked this before: is this supposed to be a function with one return or multiple return values?
@syclik: Which function are you asking about? It's supposed to be a test framework for all of our differentiable functions.
Suppose we have a function f : R^N -> R^M.
The exhaustive strategy is to test each of the N * M first-order, N^2 * M second-order, and N^3 * M third-order derivatives.
The directional derivative strategy is to choose a few M-vectors, say y1, ..., yK, and test the function lambda x. f(x).transpose() * yk, which is R^N -> R. That reduces the testing load by a factor of M without seeming to lose any information if the yk aren't regular.
In general, this is going to work to test arbitrary functions of the form f : (R^N1 x ... x R^Nj) -> R^M. The vectorized functions all have this form and my suggestion was that we just test them like all the unvectorized functions. So far, the vectorization is for distributions and for unary functions pretty much. I was just suggesting we not test the 10,000 possible instantiations of student-t, say, or the arbitrary number of instantiations of the vectorized functions (which work on arbitrary dimension arrays), but make a strategic selection of types to test.
So I don't see how this depends on automatic vs. hand-written derivatives or vectorized vs. unvectorized functions.
@Bob_Carpenter, thanks for the clarification! That helps a lot.
So I'm definitely with you that we don't need to do exhaustive testing. I'm trying to write out what I think are common yet hard-to-debug errors in writing custom gradient code.
What I'm thinking is we're covered if we do this: f : (R^2 x R^2 x ... x R^2) -> R^M, where we test the case where each of the M args is the same var and the case where each of the M args is an independent var.
I think we'd still be down to R^(2N) -> R, but maybe I'm mistaken.
So with the lingo so far, f : R^N -> R^M, the thing I was talking about testing is the product w^T * J * v, where J is the Jacobian of f, w is an M-row random column vector, and v is an N-row random column vector.
Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test. If you kept doing them, you could solve for the J. The finite difference stencils and complex step check code would be simplified as well.
Cool. I missed the final multiply by v.
Whether you're testing with fvars or vars, it's one forward or one reverse pass to get the number you wanna test.
That last multiply by v is another O(m) reduction in testing load.
If you kept doing them, you could solve for the J.
Would that require M passes?
I don't think this is necessary.
If there are special inputs, we can test for those in the same way.
The finite difference stencils and complex step check check code would be simplified as well.
I'm not sure what you mean by "stencil", but the functionals for finite diff should stay the same. We can add a functional to compute directional derivatives with reverse mode, but I don't know how you'd use that in testing.
I want to make sure we don't lose track of @bbbales2's effort. His original PR was here: #1021.
We identified a couple problems. The first was that the variables weren't persisting on the stack, so we were dealing with undefined behavior. We can fix this by lazily promoting the argument types to stan::math::var types.
There's a really cool thing that @bbbales2 did inside his PR that we should take note of. His call_all_argument_combos expands arguments into the exhaustive list of tuples of arguments. It's done using some neat recursion. We'll want to do the same using types instead of instantiated objects.
I updated the issue description with more information. I've been slowly working on this... I'm closer, but still not complete. Would love help if anyone is inclined.
I'm interested. What might be involved?
Great question. First off, a lot of variadic template stuff. (I'm still getting fluent with it; assume C++14.)
If you want to know exactly what I'm thinking about, @bbbales2 put together something called call_all_arg_combos or something like that on his branch. It essentially walks through each of the arguments, figures out what needs to be promoted (from double to var, and it handles things like std::vector<double>), then builds a typelist with the full combinatorial expansion of arguments AND it also carries the instance of this type, which contains the promoted vars. He cleverly built that recursively using a bunch of templated C++.
That was almost correct. Instead of promoting all the vars eagerly, we want to promote lazily. That means we need to do the same thing that he did with the typelist, but have the promotion happen later. That way we can be safe about usage by adding vars to the stack, running the gradient code, then recovering memory multiple times. So the thing we need to do is that combinatorial expansion with types, not implicitly with instances. Once we figure that out, we can do the rest. I was having trouble manipulating types to build that list; it's a lot easier for me to think about building a binary tree with recursion with objects, but it is definitely possible to do with types.
I don't know how much of that is understandable, but that's the problem I'm dealing with right now. If you want to help with this part, I can try to write a proper spec for what I'm doing.
Where is the code you were using to manipulate the list of types?
I noticed that an old branch cited in Implementation: feature/issue-993-testing
is deleted or otherwise missing.
Whoops. Wrong branch. It's this one: feature/automatic-autodiff-testing
I updated the PR description with the correct branch.
I figure this is also a non-issue with the new AD testing framework. Closing. If I missed something please reopen.
Edit (@syclik): 4/24/2019. Reducing the issue to just reverse mode. We can create new issues for forward and mix later.
Description
The goal here is to make a nice, generic testing framework built with the new C++14 stuff. This would be used from within Google Test.
Generally, the goal would be to automatically test the consistency of prim/var function implementations with any number of inputs. This would make development of the Math library easier.
What is meant by consistency in all these cases is that, for a given set of inputs,
Other requirements:
Design
A possible interface looks like:
This will be able to handle std::vector<double> and std::vector<int> with this interface.
NB: I had originally wanted this interface, but I couldn't see a way to implement it. There might be one, but I couldn't figure it out.
Implementation
Technical things to make this possible:
- the work-in-progress branch (feature/automatic-autodiff-testing)
- promotion to var as needed

Instead of computing Jacobians of multiple-input / non-scalar-output functions, I'm more interested in dotting outputs with random vectors and testing gradients in random directions a few times (so tests are always on scalars). It might be too early an optimization, but I'm not fond of the idea of test complexity scaling with the number of inputs/outputs, and I think this would be just as good.
@bbbales2's implementation got close, but it promoted all the types, then ran the gradient() function, which destroys the autodiff stack. We have to promote lazily. In other words, for a univariate function, this is how it should behave. The user calls test_reverse_mode(test_function, 1.0);. From within test_reverse_mode:

1. double expected_result = test_function(1.0);
2. var x = to_var(1.0);
3. var result = test_function(x);
4. std::vector<double> grad; std::vector<var> vars = {x}; result.grad(vars, grad);
5. EXPECT_FLOAT_EQ(expected_result, result.val());
6. EXPECT_FLOAT_EQ(finite_diff(test_function, x), grad[0]); (we need some way to compute the finite diff)
7. stan::recover_all_memory();
When this is done with two arguments, we'll have to do steps 2-7 in a loop where we'll have to recursively walk through the possible combinations of var/double for each argument.
Additional Information
The exact features are unclear because I don't understand fully what is easily do-able.
Basically I need an issue to track this stuff.
This is building on stuff from a few testing frameworks that are already in Stan's unit tests:
Current Math Version
v2.18.0