Can't the BP code detect if the norm is zero at some point? Presumably the message tensors become zero?
Note that I was advocating for removing `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` and `check_hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` here: https://github.com/mtfishman/ITensorNetworks.jl/pull/144#issuecomment-1983581282, in case that is relevant.
Hmm, yeah, I'm not sure what the bug is here because I don't see it locally on my machine. I thought something had changed in ITensors.jl and so it was importing something that no longer exists. Let me try to find the issue.
> Can't the BP code detect if the norm is zero at some point? Presumably the message tensors become zero?

Yeah, I wonder: if the norm of the message tensors is zero, does that guarantee (assuming you didn't initialize the message tensors to 0) that the norm of the wave function is zero? I assume so? In that case we could have that catch in the `inner(alg::Algorithm"bp", ...)` code?
I suppose a message could end up accidentally being zero without the norm of the network being zero. However, that seems unlikely, and it also seems like it would be a problem for BP anyway, since that message would then always remain zero and cause other messages to become zero during updates involving it. A fix for that situation could be to randomize the message and see if it ends up back at zero or not. But my point is that this seems like something we should try to handle at the level of the BP code, rather than in a higher-level code like `inner`, since this could be a potential issue for many functions that use BP, not just `inner`.
I think probably for running BP on a norm network you can show that the only way a message can end up zero is if the network is zero or if it is initialized to zero, since each update is of the form of a CP-map.
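As an aside, a minimal sketch of the randomize-and-recheck idea suggested above (all names here are hypothetical, not the actual ITensorNetworks.jl API):

```julia
using ITensors
using LinearAlgebra

# Hypothetical guard inside a BP message-update loop: if an updated message
# comes back (numerically) zero, re-randomize it and update again; if it
# returns to zero, the network itself likely contracts to zero.
function update_message_with_zero_check(update, m::ITensor; atol=1e-14)
  m_new = update(m)
  if norm(m_new) <= atol
    m_new = update(randomITensor(inds(m)...))
    norm(m_new) <= atol && @warn "Message stays zero: the network may contract to zero."
  end
  return m_new
end
```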
It may be related to https://github.com/ITensor/ITensors.jl/pull/1356: we moved all of the MPS/MPO code into an `ITensorMPS` submodule of ITensors.jl, which changed the namespaces of some internal functions, probably including `check_hascommoninds` (i.e. `check_hascommoninds` probably isn't available from `ITensors` anymore, and instead must be accessed from `ITensors.ITensorMPS`).
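In code, the namespace change being described would look something like this (a sketch, assuming `check_hascommoninds` still exists under the new submodule):

```julia
using ITensors

# Before ITensors.jl#1356, this was reachable at the top level:
# ITensors.check_hascommoninds

# After the MPS/MPO code moved into the ITensorMPS submodule:
ITensors.ITensorMPS.check_hascommoninds
```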
But if we have to circumvent issues around `check_hascommoninds` anyway, I think we may as well just remove it, since I don't think it is serving a good purpose in ITensorNetworks.
Okay, I can remove it. Shall I remove the calls to `check_hascommoninds` in the `treetensornetwork` code then? Presumably it is just being used for some assertions.
Yeah, thanks! Can you also remove `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` as well?
@mtfishman actually this seems to be happening with more than just `check_hascommoninds` due to the ITensors.jl changes; I can try to hunt them down.
Seems like we may as well just remove `src/treetensornetworks/solvers/deprecated` entirely instead of trying to update it.
Also `src/treetensornetworks/deprecated_opsum_to_ttn.jl`.
@JoeyT1994 reconsidering this, I'm concerned this PR is going to get messy with all of these other changes. Do you mind if I make a PR updating to the latest ITensor version, removing deprecated code, removing `check_hascommoninds`, etc., and then you can merge that into this PR?
> I think probably for running BP on a norm network you can show that the only way a message can end up zero is if the network is zero or if it is initialized to zero, since each update is of the form of a CP-map.
Yes, you are right that if it is a CP-map, a message being zero directly implies one of those two things. Perhaps with a non-CP-map there is a possibility the outgoing message ends up being zero by virtue of the incoming message being a zero eigenvector of the local tensor, but that seems like a weird case and likely implies the tensor contracts to 0 too.

In principle, if a message is 0 then the BP code should rapidly converge to 0 in a few sweeps (as all the other messages end up zero), and that implies something about the tensor network. So it seems to me this isn't actually something that the BP code needs to catch and is part of the algorithm. The exception is if we are trying to normalize the message tensors, in which case we should flag an error?
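Along the lines of that exception, a sketch of the kind of error flag being suggested (a hypothetical helper, not part of the package):

```julia
using ITensors
using LinearAlgebra

# Hypothetical: normalizing a message tensor should fail loudly if the
# message has zero norm, since 0/0 would silently produce NaNs downstream.
function normalize_message(m::ITensor)
  n = norm(m)
  iszero(n) && error("Message tensor has zero norm; the network may contract to zero.")
  return m / n
end
```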
@mtfishman I have updated the PR based on what we spoke about on Wednesday.
`inner(x, A, y)` passes to `BiLinearForm(A, x, y)`, which does the mapping on `x` to bring it into the dual space. `logscalar()` is now generic to any algorithm where the cache has `scalar_factors(cache)` defined on it, which `inner` applies to. Let me know your thoughts.
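For illustration, a usage sketch of the updated interface (a sketch under assumptions: the graph setup is arbitrary, and the constructor names follow usage elsewhere in this thread):

```julia
using ITensorNetworks
using NamedGraphs: named_grid

g = named_grid((3, 3))
s = siteinds("S=1/2", g)
x = random_tensornetwork(s; link_space=2)
y = random_tensornetwork(s; link_space=2)

# ⟨x|y⟩: x is mapped into the dual space internally
# (for inner(x, A, y) this happens via BiLinearForm(A, x, y)).
inner(x, y)
loginner(x, y)
```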
Attention: Patch coverage is 69.73684% with 46 lines in your changes missing coverage. Please review. Project coverage is 82.32%. Comparing base (ae4ad2c) to head (0026623). Report is 1 commit behind head on main.
@mtfishman let me know if you have any further comments or if you are happy to merge.
I'll take another look through the PR.
@JoeyT1994 I merged #155 into main and then tried merging it into this branch and fixing conflicts, but it seems like it broke some BP tests... I had to fix some issues in the tests in #155 because of changes to the normalization of certain tensor network constructors, but I'm not sure if that is related.
Okay, no worries, let me pull that merge to my local machine and try to resolve the test failures.
Thanks. That change was supposed to be pretty benign but perhaps the behavior of some of the constructors changed accidentally.
Yeah, the issue is that now `scalar(alg::Algorithm"bp", tn::AbstractITensorNetwork, ...)` seems to not be returning the correct answer any more. Can't quite figure out why yet.
Okay, actually I don't think it's a BP issue: the `inner_network` constructor (which passes to `BiLinearFormNetwork` and inserts an identity operator in the middle) is doing something different now than what it was, i.e.

```julia
ψ = random_tensornetwork(s; link_space=χ)
contract(inner_network(ψ, ψ)) == contract(ψ ⊗ prime(dag(ψ); sites=[]))
```

now gives `false`.
I see where the issue is: it's the `delta_network(union_all_inds(siteinds(ket), dual_site_index_map(siteinds(ket))))` constructor. It appears to be inserting matrices like

```
1 0
0 0
```

on the vertices instead of identity matrices.
I think that may have been an incorrect usage of `delta_network`, see my messages on Slack.
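For context, the standard ITensors.jl `delta` semantics show how a non-identity matrix could sneak in here: `delta` over two indices is the identity matrix, but over more than two indices it is nonzero only on the fully diagonal entries, which is not a product of identities.

```julia
using ITensors

i, j, k = Index(2, "i"), Index(2, "j"), Index(2, "k")
array(delta(i, j))     # 2×2 identity matrix
array(delta(i, j, k))  # 2×2×2 array with ones only at (1,1,1) and (2,2,2)
```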
This PR refactors the `inner` interface in ITensorNetworks.jl. Most importantly, it adds algorithmic backends, where "bp" and "exact" are currently supported. For instance, `inner(x::AbstractITensorNetwork, A::AbstractITensorNetwork, y::AbstractITensorNetwork; alg = default_alg)` and `inner(x::AbstractITensorNetwork, y::AbstractITensorNetwork; alg = default_alg)` will default to "bp" with a sensible partitioning if all the arguments are trees, and "exact" otherwise.

The assumption in `inner(...)` is that the `externalinds` of the arguments are all aligned and thus the networks stack properly under `disjoint_union` and don't need mapping, although an optional kwarg link index map (which defaults to `sim`) is provided for `x` to avoid clashes. This is consistent with the interface for ITensors.jl at the moment.

Logarithmic versions of `inner`, i.e. `loginner()`, are also provided. In the "bp" case this is neatly defined and helps to avoid overflow/underflow. In the "exact" case it just calls `log(inner(...))`, so perhaps it is a bit redundant.

Testing is added via `test_inner.jl`, and other standard tests which rely on `inner` will now direct to this and all pass. Additionally, tests in `test_treetensornetworks` will pass to this version, and not the version in `src/treetensornetworks/abstracttreetensornetwork.jl`, unless the tensor network is specifically wrapped in the `TTN` type.

A few points of note / things to think about resolving (a short usage sketch follows at the end of this description):

- If the tensors of `x` are 0, which means that `norm_sqr(x::AbstractITensorNetwork) = 0`, then the BP method will return `NaN`, as it will return the ratio 0/0. Hence this is an edge case that isn't well determined by the BP code. This is because the current BP backend computes `inner()` via a global method. A local method is possible to define for a tree to solve this (it is how Benedikt's code in `src/treetensornetworks/abstracttreetensornetwork.jl` works), and we could then define `alg = "bp_global"` and `alg = "bp_local_tree"` to let the user choose algorithms.
- `inner(alg::Algorithm"exact", ...)` passes to the `inner_network(...)` constructor, which I cleaned up a bit: `inner_network` now forwards to `flatten_networks` but takes the `dag` on the first argument, and its keyword arguments default to no flattening or combining (which are useful kwargs for "exact").
- `inner(alg::Algorithm"bp", ...)`, instead, internally builds the relevant `ITensorNetwork` via `disjoint_union`. This is because the `flatten_networks` builder results in confusing vertex names for more than two networks, and thus it is difficult to parse when trying to define a good `default_partitioning`.

@mtfishman @b-kloss
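A short sketch of the backend selection (signatures per the description above; `x`, `A`, `y` are placeholder `AbstractITensorNetwork`s with aligned external indices):

```julia
inner(x, y; alg="exact")   # exact contraction of the stacked network
inner(x, y; alg="bp")      # belief-propagation approximation
loginner(x, y; alg="bp")   # logarithmic version, avoids overflow/underflow
inner(x, A, y)             # default alg: "bp" on trees, "exact" otherwise
```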