Avoid leaky towers of kwargs

amilsted commented 2 years ago

I think it would be nice to remove the leaks from the "kwargs tower" in this package. Lots of functions ultimately pass kwargs down to tdvp(direction::Base.Ordering, solver, PH, time_step::Number, psi::MPS; kwargs...) in tdvp_step.jl, but there is no checking for "leftover" kwargs that are never used.

I want to know if I am supplying a kwarg that isn't used anywhere!

Perhaps one fix is to have explicit keywords in the method signature for that particular function, or alternatively to pop() from kwargs and check it's empty at the end?

PS: Is there a way to pass kwarg docstrings back up the stack to the docstrings of higher-level functions?

amilsted commented 2 years ago

One possible issue: kwargs is also passed to the user-defined checkdone() in tdvp_generic.jl. Perhaps we can provide a separate checkdone_kwargs for that?

emstoudenmire commented 2 years ago

While I fully agree it's good to get an error message if you supply a kwarg that isn't used anywhere, right now this seems like an issue where the design of Julia offers some tradeoffs and hard to make the perfect one.

Currently the code in this package is rather explicit about the kwargs that get passed down to each function (from tdvp, to tdvp_sweeps, to tdvp_step, etc.). One downside, though, is that it's a lot of work to "plumb" a new kwarg through, especially to lower levels like the solver. Some of the kwargs aren't getting passed, so there are now another kind of silent bug.

Yet another downside is that now the code is very long and verbose with lots of explicit kwarg reading and passing everywhere. It's harder to maintain this way I believe.

Unless there is a third choice besides:

just passing kwargs... through all the way down
explicitly listing and unwrapping and re-passing 10+ kwargs to all the intermediate functions just so they can get passed to the lowest levels

then I would vote we live with #1, and the unfortunate silent failures / lack of error messages that can result from this approach.

Happy to discuss more, because maybe there's a way to balance these considerations that I missed.

amilsted commented 2 years ago

Yeah, the current code is relatively explicit in a few places. I think it's a side-effect of the "process the kwargs in the function body with get() pattern"? I think I prefer multi-line method signatures for getting keyword arguments needed in the present function, combined with bundled passing of kwargs... otherwise.

I'm mainly concerned with documentation and catching unused kwargs. Perhaps the suggestion here to used structs to hold large sets of arguments, is the "right" alternative to using kwargs.

Anyway, to catch unused kwargs, it is enough if the "tower" has a "bottom".

function ftop(a,b; kw1=nothing, kwargs...)
    ...
    fmid(b; kw2=whatever, kwargs...)  # this function does not get to see `kw1`
end

function fmid(b; kw2=nothing, kwargs...)
    ...
    fbottom(c; kw3=something, kwargs...)
end

function fbottom(c; kw3=nothing, kw4=nothing)   # DOES NOT SLURP KWARGS
   return "oh hi"
end

ftop(a,b; kw5="muahaha")   # unused kwarg causes error to be thrown

WIth the "get kwargs in the body" pattern, one could achieve similar by "popping" kwargs rather than "getting" them. Unfortunately, this does not solve the issue with documenting all these arguments and making them discoverable! The struct solution is much better for those things.

emstoudenmire commented 2 years ago

Thanks for the helpful and interesting discussion. That struct approach in the link is an interesting one, and points to how a keyword system, as good as Julia's is, could be even better in principle (and maybe a combination of existing language features plus some new ones in the future could lead to better practices).

I see really well what you mean now about:

the 'tower' having a base that doesn't slurp
the idea of "popping" by getting the kwargs in the function definition, and avoiding the get approach. I think I was the one who started using that approach a lot in ITensors.jl but now in hindsight I'm not really sure why I thought it was needed. (There are some corner cases where it can be better, but only pretty rare ones maybe.)

So I think that pair of best practices above goes a long way toward fixing issues with keyword args. Thanks –

mtfishman commented 2 years ago

I agree that I don't like the current way we handle keyword arguments, I believe it is a historical artifact from porting the C++ DMRG code. @amilsted I think the approach you outline is good, and it would be nice to avoid using get if possible.

mtfishman commented 2 years ago

Also I think we should make use of structs/NamedTuples to pass certain groups of keyword arguments that we know will all get passed to one function lower down (such as keyword arguments for the solver).

emstoudenmire commented 2 years ago

Agreed we should think of these as the new “best practices” for this package and others. I’ll start incorporating these patterns into my PRs.

On Tue, Oct 11, 2022 at 6:04 PM Matt Fishman @.***> wrote:

Also I think we should make use of structs/NamedTuples to pass certain groups of keyword arguments that we know will all get passed to one function lower down (such as keyword arguments for the solver).

— Reply to this email directly, view it on GitHub https://github.com/ITensor/ITensorTDVP.jl/issues/27#issuecomment-1275327290, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHJ3RNPCLISMOFDG26VNA3WCXP63ANCNFSM5XSWXGSA . You are receiving this because you commented.Message ID: @.***>

-- -=Miles Stoudenmire=- @. @. http://itensor.org/miles/

emstoudenmire commented 2 years ago

So in working on the linsolve PR I am running into one issue where I can't quite reach the ideal of catching all the kwargs.

The issue is that I run into the following choices:

I put a precise list of kwargs in my solver definition, since it is the "lowest level of the tower" in the above design suggestion from Ash and I want kwarg misspellings to get caught and throw an error. However, then the generic tdvp function is passing other kwargs not relevant to my solver, which I didn't put and don't want to have to catch, so I get an error.
To fix this, I add kwargs... to the end of my solver function, but now I've lost the behavior where a misspelled kwarg will be caught.

So maybe this is a fundamental tension with generic code and catching all kwarg misspellings? We could in the solvers just always catch all possible solver arguments, and not slurp (no kwargs...), but then if we add a new kwarg for one solver we'd have to add it to all the solvers and it would make the process of defining new solvers more rigid and difficult.

Do either of know a way around this tensor @amilsted @mtfishman ?

amilsted commented 1 year ago

What are these other kwargs that are being passed to the solver? The idea would be to "pop" them all as we go down the levels, so that only the relevant ones are left at the bottom level. This can be done by just catching the ones we want in the call signature.

There is a caveat I just remembered. All bottom-level functions have to be non-slurping for this to work, so you don't want to pass your kwargs down to a user-supplied callback function.

emstoudenmire commented 1 year ago

First of all, agreed that all of the functions at the bottom of the tower (solvers in this case) would have to be non-slurping. I believe that's manageable at the level of the solvers included with the package. Then if an outside user inputs a solver which is slurping, then they will be taking on the risk that a misspelled kwarg might then get missed, and that's ok because it's just a mechanism of Julia that we don't have a way around.

Good question about the other kwargs that get passed to the solver. A lot do get popped on the way down, or will in an upcoming PR meant to address this issue.

However, here's the issue I'm seeing (both narrowly and broadly):

right now, when I make the linsolve solver non-slurping, I get an error related to the fact that it isn't taking a kwarg called current_time
narrowly this could be fixed by either making sure to pop current_time if it isn't meant for solvers
but if it is meant for solvers (and other kwargs certainly may well be, which are used for some solvers but not by the linsolve one) then we end up in a situation where all the solvers have to take all the solver kwargs. To be clear, linsolve has no use for current_time but in the current situation might have to accept it by name anyway.

So I was just looking for a way around that kind of "rigidity" of the design, where all the solvers have to take all the other solvers' args to work.

Here could be another, and possibly bigger, reason to adopt @mtfishman's point about using NamedTuples. If we make it so that the common kwarg at the bottom, for solvers, is a single kwarg called solver_kwargs which is a NamedTuple then it seems to nicely solve this issue. Because of course there's no problem requiring all the solvers to take that one kwarg and then put whatever sub-kwargs inside it or not as desired.

But none of this was particularly obvious to me before this issue thread, so I am appreciating the back and forth here and I think we are converging on something really good.