Knowing if optimization (or variational inference) is successful or not by a value

stan-dev / cmdstanr

CmdStanR: the R interface to CmdStan

https://mc-stan.org/cmdstanr/


Knowing if optimization (or variational inference) is successful or not by a value #332

Closed beyondpie closed 4 years ago

beyondpie commented 4 years ago

When I run optimization, I can see logs on the screen that tell me whether the optimization ran into a problem and failed, for example because of an infinite log probability.

Is it possible to get this status as a value returned by some function, for optimization (or variational inference)? I ran into this because I want to run optimization thousands of times.

rok-cesnovar commented 4 years ago

Right now we do not expose return codes to the user. Though I think having something like fit$process_return_code() or just fit$return_code() is a good idea. @jgabry what do you think?

Until we decide on and implement that you could do this by checking fit$output(). Those are stored standard outputs. You can check for specific prints in cases of error. Not ideal but I guess it could help?

For example, a check for BFGS/L-BFGS can be done with:

# line = line from fit$output()
if (regexpr("Optimization terminated with error", line, perl = TRUE) > 0) {
 # do something in case of error
}
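The same check can be applied to a whole vector of output lines at once. This is a minimal sketch, assuming the output has already been captured as a character vector; `has_optim_error()` and the sample lines are made up for illustration and are not part of cmdstanr. Only the searched message comes from the snippet above.

```r
# Hypothetical captured output, one element per line. The first and last
# lines are placeholders; only the error message is taken from the thread.
output_lines <- c(
  "some earlier output line",
  "Optimization terminated with error",
  "some later output line"
)

# Illustrative helper (not a cmdstanr function): TRUE if any line contains
# the error message. fixed = TRUE matches the text literally, not as a regex.
has_optim_error <- function(lines) {
  any(grepl("Optimization terminated with error", lines, fixed = TRUE))
}

failed <- has_optim_error(output_lines)
if (failed) {
  # do something in case of error, e.g. skip this fit
  message("optimization reported an error")
}
```

Matching on a literal string keeps the check simple, but it depends on the exact wording of the message, so it may need adjusting if the printed text differs between algorithms or versions.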

beyondpie commented 4 years ago

@rok-cesnovar I notice that fit$output is defined as:

function (id = NULL) 
{
    cat(paste(self$runset$procs$proc_output(1), collapse = "\n"))
}

I also checked fit$runset$procs, and it has two functions: is_finished and is_failed. I guess fit$runset$procs$is_finished() checks whether the optimization reached a solution within the iteration limit, and fit$runset$procs$is_failed() checks whether the optimization hit some unusual situation, such as an infinite log probability?

Am I right?

rok-cesnovar commented 4 years ago

Unfortunately no. is_finished() basically means that the process that ran the executable is no longer running, and is_failed() is only marked as TRUE if optimization failed before completing (the process either crashed, was not given all input data, or was not able to initialize).

Actually, you can check this with fit$runset$procs$get_proc(1)$get_exit_status(). This returns the exit code of the process, for example the optimization run. It's 0 if successful, and I think 70 when optimization did not converge. For variational inference you can use the same call. For sampling you need to replace 1 with the chain IDs.
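A minimal sketch of wrapping that call in a helper that returns TRUE or FALSE. `optimization_succeeded()` and `make_mock_fit()` are hypothetical names introduced here; the mock only imitates the nested structure used in the call above, so the example runs without CmdStan installed. With a real fit object the same accessor chain would be used, assuming the internal structure is as described in this thread.

```r
# Illustrative helper (not a cmdstanr function): TRUE only when the exit
# status of the given process is exactly 0.
optimization_succeeded <- function(fit, proc_id = 1) {
  status <- fit$runset$procs$get_proc(proc_id)$get_exit_status()
  # isTRUE() also returns FALSE for NULL or NA statuses
  isTRUE(status == 0)
}

# Mock object that imitates fit$runset$procs$get_proc(id)$get_exit_status().
# This is an assumption for demonstration, not the real cmdstanr class.
make_mock_fit <- function(exit_status) {
  list(runset = list(procs = list(
    get_proc = function(id) list(get_exit_status = function() exit_status)
  )))
}

ok  <- optimization_succeeded(make_mock_fit(0))   # exit status 0
bad <- optimization_succeeded(make_mock_fit(70))  # non-zero exit status
```

Treating anything other than an exact 0 as a failure is deliberately conservative: a missing or unexpected status is handled the same way as an error.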

rok-cesnovar commented 4 years ago

I am guessing we could just add fit$runset$procs$return_codes() and don't really need fit$process_return_code().

beyondpie commented 4 years ago

@rok-cesnovar Thanks! I see 0 for my optimization when running fit$runset$procs$get_proc(1)$get_exit_status(). I will use this to check the optimization and variational inference. Thanks.

BTW, about variational inference: I notice that the results are samples.

Can I get the fitted parameters? I know Stan uses ADVI for the approximation, so ideally I should get the mean and variance for the unconstrained variables, right? I would also need the transformation mappings between the constrained and unconstrained variables, so this feature might not be workable.
Are the samples drawn after the variational inference has converged, or, as in HMC, drawn while the approximating distribution is still being learned? If it is like HMC, these samples may not be good to use directly as approximations for the corresponding variables. I cannot find any material discussing this.

Thanks!

mitzimorris commented 4 years ago

a few observations:

the sampler, the optimizer, and VI were implemented at different times by different developers with different interpretations of "success" and "failure".
the default use case was always running Stan in an interactive session, using text messages to stdout and stderr
the Stan services module use of return codes is based on a set of POSIX return codes for middleware; the fact that these numeric codes are returned to the interfaces is of minimal use - see above point

put that together, and it's really hard to encapsulate the notion of "success" and "failure" for a production workflow using return codes - we need to grab and parse stdout and stderr differently for each method.

beyondpie commented 4 years ago

@mitzimorris Could I simply ignore the possibly different interpretations of "success" and "failure"? I mean, can I just use fit$runset$procs$get_proc(1)$get_exit_status() to check whether the run succeeded? If not zero, then I just choose to not believe the results.

rok-cesnovar commented 4 years ago

If not zero, then I just choose to not believe the results

I think this is completely fine.
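For the "thousands of runs" use case, the rule agreed on here (discard anything with a non-zero exit status) could be applied to a list of fits roughly as follows. This is a sketch under stated assumptions: `exit_status_of()`, `keep_successful()` and `make_mock_fit()` are hypothetical helpers, and the mock objects stand in for real fit objects so the example runs on its own.

```r
# Accessor chain quoted earlier in this thread, wrapped for reuse.
exit_status_of <- function(fit) {
  fit$runset$procs$get_proc(1)$get_exit_status()
}

# Illustrative helper: keep only the fits whose exit status is 0.
# Fits whose status cannot be read are treated as failures.
keep_successful <- function(fits) {
  statuses <- vapply(fits, function(f) {
    s <- tryCatch(exit_status_of(f), error = function(e) NA_integer_)
    if (is.null(s)) NA_integer_ else as.integer(s)
  }, integer(1))
  fits[!is.na(statuses) & statuses == 0L]
}

# Mock fits imitating the nested structure (an assumption for demonstration).
make_mock_fit <- function(exit_status) {
  list(runset = list(procs = list(
    get_proc = function(id) list(get_exit_status = function() exit_status)
  )))
}

fits <- list(a = make_mock_fit(0), b = make_mock_fit(70), c = make_mock_fit(0))
trusted <- keep_successful(fits)
```

Here `trusted` keeps the entries named `a` and `c`; the run with the non-zero status is dropped rather than inspected further.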