Add Pathfinder and Multi Pathfinder

SteveBronder commented 2 years ago

Submission Checklist

[x] Run unit tests: ./runTests.py src/test/unit
[x] Run cpplint: make cpplint
[x] Declare copyright holder and open-source license: see below

Summary

Adds Pathfinder and Multi pathfinder to the service APIs. This is a WIP until I add the final tests some individual functions inside of single pathfinder, but I think both algorithms are ready for review. Both algos can be found in src/stan/services/ along with the functions for generating PSIS weights for multi pathfinder.

Intended Effect

How to Verify

This PR needs a few more tests for some functions used in single pathfinder. But I had tests for the psis functions as well as higher level tests for single and multi-pathfinder that can be run with

./runTests.py -j4 ./src/test/unit/services/pathfinder/*

Side Effects

I think mostly what we want to verify is that the functions in single pathfinder lineup with the math in the paper.

Also I'm happy to delete the debug functions, though they should be essentially no-ops without any of the macros turned on and are nice for debugging.

Documentation

Docs added for each function that is more than encapsulating a one liner.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Steve Bronder

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

SteveBronder commented 2 years ago

(it says +20k lines but that's mostly just from the glm test csv)

bob-carpenter commented 2 years ago

This somehow got buried in my email, but I can review tomorrow or Friday.

rok-cesnovar commented 2 years ago

I would like to have @rok-cesnovar review the matrix efficiency in all of this, but the basic code looks good.

Will do! Requested my review so I don't forget.

bob-carpenter commented 2 years ago

Thanks, @rok-cesnovar. I think it'll be easiest to do this after the fixes are in for all the change requests Lu and I made. At that point, we really only need the matrix algebra reviewed for efficiency. I'm much less worried about inefficiency than incorrectness, so it'd also be OK to skip the efficiency review or save it for later.

rok-cesnovar commented 2 years ago

I think it'll be easiest to do this after the fixes are in for all the change requests Lu and I made. At that point, we really only need the matrix algebra reviewed for efficiency.

Yes of course. Fully agree.

bob-carpenter commented 2 years ago

I resolved the issues that I think are resolved. Sorry there were so many comments in this---it's a massive PR!

I'm happy to add all of the little doc things that I highlighted if you don't want to. I actually enjoy writing doc.

SteveBronder commented 1 year ago

I ripped out all the parallelization stuff in single pathfinder. I need to answer a few more of the comments and fixup the docs and then I think this will be ready to be looked at again

serban-nicusor-toptal commented 1 year ago

Hey @SteveBronder, I think the UNSTABLE Jenkins status is an old CI unintended bug #3166. Fixing develop in should sort it out.

SteveBronder commented 1 year ago

So I think this is all going well now and is ready for review (as long as tests pass).

Just for future reference, I have a version of this locally that does single pathfinder concurrently with the lbfgs iterations. It gives about a 15-20% speedup for single pathfinder but there's a few gotchas that would need to be sorted out if we want to use it in develop. I think for now it's better that we just use a single threaded single pathfinder, but I wrote the below for reference if someone has this idea in the future and wants to work on it.

Each pathfinder iteration only depends on the last lbfgs iteration so we can do a queue scheme. After each lbfgs iteration we copy the inputs needed for that pathfinder iteration, spawn a job in the thread pool, and store the results of the job in a concurrent queue. Then we spawning thread checks the queue to see if new/better results are available and if so updates the best known pathfinder iteration.

This gets bad in multi-pathfinder, where if someone has 8 cores and does multi-pathfinder with 8 single pathfinders, those jobs spawned from the single pathfinder are just going to sit around until a thread is available to start running them. Since in this case all 8 cores are used by all 8 pathfinders we will have tons of copies of the pathfinder job inputs sitting around.

I think the only easy way to resolve this for multi-path users is to look at the number of threads available and the number of single pathfinders in used in the multi pathfinder. If its 1:1 (8 cores / 8 pathfinders) then we have no concurrent jobs, if we have 2:1 (16 cores / 8 pathfinders) then we have 1 concurrent job per pathfinder. Though I think making the main threads wait while their jobs finish would add other overhead we'd have to think about

So because of the above I think it's better we don't do the concurrent scheme atm

mitzimorris commented 1 year ago

per discussion: add arg to multi-pathfinder signature save_pathfinder_iterations

mitzimorris commented 1 year ago

does this PR include changes to stan::callbacks::unique_stream_writer ? plugging this branch into CmdStan and running example model bernoulli.stan with the sampler results in this segfault (traceback from gdb)

(gdb) info stack
#0  0x00007ff81555be7d in ?? ()
#1  0x0000000100165f60 in vtable for stan::callbacks::unique_stream_writer<std::__1::basic_ostream<char, std::__1::char_traits<char> > > ()
#2  0x0000000000000000 in ?? ()

mitzimorris commented 1 year ago

The single path pathfinder diagnostic file will be in JSON format because the per-iteration information has a lot of structure that can't be flattened into CSV format. This file is used to figure out where pathfinder its problems. It report parameters, gradients, and Taylor approximation

Proposed JSON format:

{
  {
    // lbfgs
    iter: 0
    params: [...] // vector
    gradients: [....] // vector
    // taylor_approx_t
    x_center: [...] // vector
   logdetcholHk: [.] // double, Log deteriminant of the cholesky
    L_approx [..., ...]  //Matrix of Approximate choleskly
    Qk: [..., ...];  //Matrix Q of the QR decompositon. Only used for sparse approx
    alpha [...]  // vector diagonal of the initial inv hessian
    string use_full;  // boolean indicationg if full or sparse approx was used.
  }
}

SteveBronder commented 1 year ago

Yeah for the diagnostic output I think we need to write a json_writer that can take in a tuple of tuples and dispatches writes of the objects in the record correctly.

// begin record
     diagnostic_writer.begin();
      diagnostic_writer(
        std::make_tuple(
          std::make_tuple("params", Eigen::Map<Eigen::VectorXd>(cont_vector.data(), num_parameters).eval()),
          std::make_tuple("gradients", Eigen::Map<Eigen::VectorXd>(g1.data(), num_parameters).eval())
          )
      );
// ...
// end record
diagnostic_writer.end();

mitzimorris commented 1 year ago

is there a reason why output columns from single- and multi-path pf are in different order? multi-path: algo outputs "lp_approx" and "lp" are first. single-path: params first.

convention (although probably unwritten and maybe not always respected) is that all "__" columns come first. some pkgs might check that first column is "lp__" as a kind of sanity check on the contents. why put "lpapprox_\" first?

mitzimorris commented 1 year ago

Pathfinder output diagnostic files missing newline at end.

SteveBronder commented 1 year ago

Awesome!! I'm going through right now to double check stuff and fixed the issues you pointed out above. Once tests pass I will merge!

SteveBronder commented 1 year ago

@mitzimorris done!! Now we just need to hook it up to cmdstan and whatnot

LuZhangstat commented 1 year ago

Thank you so much for all the excellent work. Please contact me if I can help.

From: Steve Bronder @.> Sent: Tuesday, June 6, 2023 7:54 AM To: stan-dev/stan @.> Cc: Lu Zhang @.>; Mention @.> Subject: Re: [stan-dev/stan] Add Pathfinder and Multi Pathfinder (PR #3123)

@mitzimorrishttps://urldefense.com/v3/__https://github.com/mitzimorris__;!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gz7sQNr8g$ done!! Now we just need to hook it up to cmdstan and whatnot

— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https://github.com/stan-dev/stan/pull/3123*issuecomment-1578921271__;Iw!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gw4pVLHdw$, or unsubscribehttps://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AFUN2LIQGH66F7T62T3FPPDXJ5ADJANCNFSM5UQ34YGA__;!!LIr3w8kk_Xxm!sDYaSwtxyeTTQbLRUfOeIhHUHmdt2wY2-iY8FEbW5An9zGg458VvEufvmVP3TcRdDmIlQbJ-qe0_8U9B9gyTsxjNrg$. You are receiving this because you were mentioned.Message ID: @.***>

mitzimorris commented 1 year ago

@SteveBronder starting work on CmdStan, but looks like merge into develop failed.

SteveBronder commented 1 year ago

Alrighty I can take a look tmrw. That branch I wrote a while ago is kind of old, maybe good to just start with a new one?