Closed SteveBronder closed 1 year ago
(it says +20k lines but that's mostly just from the glm test csv)
This somehow got buried in my email, but I can review tomorrow or Friday.
I would like to have @rok-cesnovar review the matrix efficiency in all of this, but the basic code looks good.
Will do! Requested my review so I don't forget.
Thanks, @rok-cesnovar. I think it'll be easiest to do this after the fixes are in for all the change requests Lu and I made. At that point, we really only need the matrix algebra reviewed for efficiency. I'm much less worried about inefficiency than incorrectness, so it'd also be OK to skip the efficiency review or save it for later.
I think it'll be easiest to do this after the fixes are in for all the change requests Lu and I made. At that point, we really only need the matrix algebra reviewed for efficiency.
Yes of course. Fully agree.
I resolved the issues that I think are resolved. Sorry there were so many comments in this---it's a massive PR!
I'm happy to add all of the little doc things that I highlighted if you don't want to. I actually enjoy writing doc.
I ripped out all the parallelization stuff in single pathfinder. I need to answer a few more of the comments and fixup the docs and then I think this will be ready to be looked at again
Hey @SteveBronder, I think the UNSTABLE Jenkins status is an old CI unintended bug #3166. Fixing develop in should sort it out.
So I think this is all going well now and is ready for review (as long as tests pass).
Just for future reference, I have a version of this locally that does single pathfinder concurrently with the lbfgs iterations. It gives about a 15-20% speedup for single pathfinder but there's a few gotchas that would need to be sorted out if we want to use it in develop. I think for now it's better that we just use a single threaded single pathfinder, but I wrote the below for reference if someone has this idea in the future and wants to work on it.
Each pathfinder iteration only depends on the last lbfgs iteration so we can do a queue scheme. After each lbfgs iteration we copy the inputs needed for that pathfinder iteration, spawn a job in the thread pool, and store the results of the job in a concurrent queue. Then we spawning thread checks the queue to see if new/better results are available and if so updates the best known pathfinder iteration.
This gets bad in multi-pathfinder, where if someone has 8 cores and does multi-pathfinder with 8 single pathfinders, those jobs spawned from the single pathfinder are just going to sit around until a thread is available to start running them. Since in this case all 8 cores are used by all 8 pathfinders we will have tons of copies of the pathfinder job inputs sitting around.
I think the only easy way to resolve this for multi-path users is to look at the number of threads available and the number of single pathfinders in used in the multi pathfinder. If its 1:1 (8 cores / 8 pathfinders) then we have no concurrent jobs, if we have 2:1 (16 cores / 8 pathfinders) then we have 1 concurrent job per pathfinder. Though I think making the main threads wait while their jobs finish would add other overhead we'd have to think about
So because of the above I think it's better we don't do the concurrent scheme atm
per discussion: add arg to multi-pathfinder signature save_pathfinder_iterations
does this PR include changes to stan::callbacks::unique_stream_writer
?
plugging this branch into CmdStan and running example model bernoulli.stan
with the sampler results in this segfault (traceback from gdb
)
(gdb) info stack
#0 0x00007ff81555be7d in ?? ()
#1 0x0000000100165f60 in vtable for stan::callbacks::unique_stream_writer<std::__1::basic_ostream<char, std::__1::char_traits<char> > > ()
#2 0x0000000000000000 in ?? ()
The single path pathfinder diagnostic file will be in JSON format because the per-iteration information has a lot of structure that can't be flattened into CSV format. This file is used to figure out where pathfinder its problems. It report parameters, gradients, and Taylor approximation
Proposed JSON format:
{
{
// lbfgs
iter: 0
params: [...] // vector
gradients: [....] // vector
// taylor_approx_t
x_center: [...] // vector
logdetcholHk: [.] // double, Log deteriminant of the cholesky
L_approx [..., ...] //Matrix of Approximate choleskly
Qk: [..., ...]; //Matrix Q of the QR decompositon. Only used for sparse approx
alpha [...] // vector diagonal of the initial inv hessian
string use_full; // boolean indicationg if full or sparse approx was used.
}
}
Yeah for the diagnostic output I think we need to write a json_writer
that can take in a tuple of tuples and dispatches writes of the objects in the record correctly.
// begin record
diagnostic_writer.begin();
diagnostic_writer(
std::make_tuple(
std::make_tuple("params", Eigen::Map<Eigen::VectorXd>(cont_vector.data(), num_parameters).eval()),
std::make_tuple("gradients", Eigen::Map<Eigen::VectorXd>(g1.data(), num_parameters).eval())
)
);
// ...
// end record
diagnostic_writer.end();
is there a reason why output columns from single- and multi-path pf are in different order? multi-path: algo outputs "lp_approx" and "lp" are first. single-path: params first.
convention (although probably unwritten and maybe not always respected) is that all "__" columns come first. some pkgs might check that first column is "lp__" as a kind of sanity check on the contents. why put "lpapprox_\" first?
Pathfinder output diagnostic files missing newline at end.
Awesome!! I'm going through right now to double check stuff and fixed the issues you pointed out above. Once tests pass I will merge!
@mitzimorris done!! Now we just need to hook it up to cmdstan and whatnot
Thank you so much for all the excellent work. Please contact me if I can help.
From: Steve Bronder @.> Sent: Tuesday, June 6, 2023 7:54 AM To: stan-dev/stan @.> Cc: Lu Zhang @.>; Mention @.> Subject: Re: [stan-dev/stan] Add Pathfinder and Multi Pathfinder (PR #3123)
@mitzimorrishttps://urldefense.com/v3/__https://github.com/mitzimorris__;!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gz7sQNr8g$ done!! Now we just need to hook it up to cmdstan and whatnot
— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https://github.com/stan-dev/stan/pull/3123*issuecomment-1578921271__;Iw!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gw4pVLHdw$, or unsubscribehttps://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AFUN2LIQGH66F7T62T3FPPDXJ5ADJANCNFSM5UQ34YGA__;!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gyTsxjNrg$. You are receiving this because you were mentioned.Message ID: @.***>
@SteveBronder starting work on CmdStan, but looks like merge into develop failed.
Alrighty I can take a look tmrw. That branch I wrote a while ago is kind of old, maybe good to just start with a new one?
Merging this broke the Windows CI, which is why the submodules in CmdStan haven't updated: https://jenkins.flatironinstitute.org/blue/organizations/jenkins/Stan%2FStan/detail/develop/131/pipeline
Submission Checklist
./runTests.py src/test/unit
make cpplint
Summary
Adds Pathfinder and Multi pathfinder to the service APIs. This is a WIP until I add the final tests some individual functions inside of single pathfinder, but I think both algorithms are ready for review. Both algos can be found in
src/stan/services/
along with the functions for generating PSIS weights for multi pathfinder.Intended Effect
How to Verify
This PR needs a few more tests for some functions used in single pathfinder. But I had tests for the psis functions as well as higher level tests for single and multi-pathfinder that can be run with
Side Effects
I think mostly what we want to verify is that the functions in single pathfinder lineup with the math in the paper.
Also I'm happy to delete the debug functions, though they should be essentially no-ops without any of the macros turned on and are nice for debugging.
Documentation
Docs added for each function that is more than encapsulating a one liner.
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Steve Bronder
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: