mtfishman opened 2 months ago
Really interesting proposals. I like the last part about how the various things that make ITensor a nice project can be decoupled from each other, like the idea of a named dimension separate from what the name should actually be (e.g. it could just be some identifier like a string, instead of an `IndexName` carrying an id, prime level, and tags). In machine learning, the idea of a named tensor has been popular for a long time, and so has the idea of using symmetric tensors inside of ML architectures, but I don't think there has been a de facto winner in terms of actual tensor libraries people use there (of course people do use libraries like PyTorch or TensorFlow, but my understanding is that these offer only rudimentary named tensor operations).
The Dict-based tags proposal seems good to me also. I like the interface it provides, which seems clearly superior to the current one, especially in terms of customizability/flexibility, and also in terms of clarity about what the various tags are for in each context (e.g. we could more clearly delineate a site type for making operators, versus just having some conventions around certain reserved names, which would also make that system seem less magical to users).
My main question at this point is about any possible memory or performance overhead of this design versus the current one. How long does it take to make, say, 1000 such indices (of course with BenchmarkTools one can benchmark this more precisely than just making 1000 of them)? Does the new design substantially increase the memory footprint for networks on large graphs with small index dimensions (e.g. a large square lattice with bond dimensions all equal to 2)?
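For example, something along these lines (a sketch assuming the Dict-based constructor proposed below; the tag names are made up):

```julia
using BenchmarkTools

# Time constructing 1000 Dict-backed indices (tag names are hypothetical).
@btime [Index(2; site="S=1/2", n=string(j)) for j in 1:1000]

# Rough per-index memory footprint:
Base.summarysize(Index(2; site="S=1/2", n="1"))
```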
I also thought of `NamedTuple` when reading about the dictionary design, but the part about having potentially many different index types as a result seems like too big of a drawback to me, for the reasons you said, unless the type could be hidden without too much cost.
Performance is definitely the biggest potential concern. However, a network with 1000 indices would also have on the order of that many tensors to contract/manipulate, so constructing indices is likely not the bottleneck there.
Perhaps index creation could become some kind of bottleneck for lazy/symbolic tensor operations, but in that case you are likely searching for optimal contraction sequences or taking symbolic derivatives, and in those applications comparing indices is the bottleneck operation rather than constructing them. One could also create temporary index names/labels (say, sequential integer labels or hashes of the indices) to speed up those comparisons as needed, which is what we currently do in the optimal contraction sequence optimization algorithm.
Indices are compared quite a bit in ITensor, say in tensor contraction and addition, so it may be a good idea to store the hash of the `Index` as part of the `Index` anyway.
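A minimal sketch of what caching the hash might look like (hypothetical `HashedIndex` type, not the current layout):

```julia
# Sketch: store a precomputed hash so repeated comparisons are cheap.
struct HashedIndex{Space}
  space::Space
  id::UInt64
  plev::Int
  tags::Dict{Symbol,String}
  hash::UInt64  # cached at construction
end

function HashedIndex(space; id=rand(UInt64), plev=0, tags=Dict{Symbol,String}())
  return HashedIndex(space, id, plev, tags, hash((id, plev, tags)))
end

Base.hash(i::HashedIndex, h::UInt) = hash(i.hash, h)

# Compare the cached hashes first, falling back to a full comparison
# only when they agree (to guard against collisions).
function Base.:(==)(i1::HashedIndex, i2::HashedIndex)
  i1.hash == i2.hash || return false
  return i1.id == i2.id && i1.plev == i2.plev && i1.tags == i2.tags
end
```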
EDIT: Regarding the memory footprint, the footprint definitely may increase noticeably in the limit of very small tensors (or lazy/symbolic tensors). However, in the cases where that actually causes an issue, I would say those users should probably define their own simpler `AbstractNamedUnitRange`/`AbstractNamedDimArray` subtype with simpler index/dimension names, with the downside that the printing won't be as nice and they may not get access to certain operator definitions.
Interesting idea to "cache the hash" for Index names. Makes sense if comparison is usually the bottleneck. I do agree that in most contexts memory is not an absolute concern, because usually the indices will be attached to a tensor which has a lot of elements inside, so the indices themselves are usually relatively small in terms of memory footprint. Even for the contraction sequence calculations, a fair comparison point might not be how fast that part could be in principle, but how fast it is compared to the subsequent contraction involving actual tensors that happens afterward. (Basically the idea of asking whether that part is ever the bottleneck of a larger sequence of operations. We would just have to see 'in the wild'.)
Computing optimal contraction sequences is in general exponential in the number of tensors in the network, so it definitely can be the bottleneck, and there are definitely cases where finding the optimal sequence takes more time than contracting the network using that sequence. However, we can speed up index label comparison by temporarily converting to labels that are faster to compare, such as small integers, before searching for contraction sequences. That is what we already do in the optimal contraction sequence algorithm in ITensors.jl, and it makes a big difference in the speed of that code.
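A minimal sketch of that relabeling trick (hypothetical helper; the actual ITensors.jl code differs):

```julia
# Map each distinct index to a small integer label before searching
# for a contraction sequence, so comparisons become integer compares.
# `network` is a collection of per-tensor index lists.
function relabel(network)
  labels = Dict{Any,Int}()
  return map(network) do tensor_inds
    map(i -> get!(labels, i, length(labels) + 1), tensor_inds)
  end
end
```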
Index tags redesign proposal

This is a proposal for a redesign of the `Index` object, specifically related to tags. There are a few things that bother me about the design of `Index` tags. My proposal for fixing those issues is to redesign the `Index` object to store the tags in a `Dict{Symbol,String}`. Here is a demonstration of how it might get constructed, printed, etc.:
Definitions
```julia
using IterTools: flagfirst

@enum Arrow In=-1 Out=1 Neither=0

struct Index{Space}
  space::Space
  dir::Arrow
  id::UInt64
  plev::Int
  tags::Dict{Symbol,String}
end

# Enable syntax `i.p` for a tag `p` stored in `tags`.
function Base.getproperty(i::Index, p::Symbol)
  if p in fieldnames(typeof(i))
    return getfield(i, p)
  end
  return getfield(i, :tags)[p]
end

space(i::Index) = getfield(i, :space)
plev(i::Index) = getfield(i, :plev)
id(i::Index) = getfield(i, :id)
tags(i::Index) = getfield(i, :tags)

map_dict(f, d::AbstractDict) = Dict(k => f(d[k]) for k in keys(d))

function Index(space; dir=Neither, id=rand(UInt64), plev=0, kwargs...)
  # Convert all keyword argument values to `String` tags.
  metadata = map_dict(string, kwargs)
  return Index(space, dir, id, plev, metadata)
end

function Base.propertynames(i::Index)
  ps = collect(fieldnames(typeof(i)))
  filter!(p -> p ≠ :tags, ps)
  append!(ps, keys(getfield(i, :tags)))
  return ps
end

function Base.:(==)(i1::Index, i2::Index)
  return (id(i1) == id(i2)) && (plev(i1) == plev(i2)) && (tags(i1) == tags(i2))
end

function Base.show(io::IO, i::Index)
  print(io, "Index(")
  for (isfirst, p) in flagfirst(propertynames(i))
    isfirst || print(io, "|")
    print(io, p, "=", getproperty(i, p))
  end
  print(io, ")")
end
```
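With those definitions, construction and tag access might look like this (the tag names are just examples):

```julia
# Construct an index with a size-2 space and two tags.
i = Index(2; site="S=1/2", n=1)

space(i)           # 2
i.site             # "S=1/2"
i.n                # "1" (keyword values are converted to strings)
propertynames(i)   # the fields plus the tag keys
```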
This design would be helpful for designing infinite tensor networks, such as the design used in ITensorInfiniteMPS.jl, where a "cell" tag is used to mark which unit cell a site of the infinite MPS is in, which helps with implementing lazy infinite indexing where only the tensors in the first unit cell are stored. Generalizing that design to higher dimensions, where you want to store a cell value for each dimension, puts a big strain on the current tag design, while it fits naturally with this new proposal.
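For example, with the Dict-based design each dimension's cell could get its own tag (hypothetical tag names):

```julia
# One cell tag per lattice direction, rather than packing everything
# into a single fixed-width tag string.
i = Index(2; site="S=1/2", cell_x="1", cell_y="2")
i.cell_x  # "1"
i.cell_y  # "2"
```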
There are of course very reasonable alternative choices for the storage of tags that have the same benefits but other tradeoffs:

- Use a `NamedTuple` instead of a `Dict` to store the tags (see the sketch below). An advantage of that design is that you could more naturally have non-homogeneous tag types (i.e. some tags could be strings, while others could be integers). Also, `NamedTuple` is statically sized and therefore could be stack allocated, depending on the types of the tags. A concern I have with that design is that if an ITensor has indices with different tag categories (i.e. one has `n=1` but another doesn't specify `n`), then the indices don't all have the same type, which could cause type stability issues (and therefore performance issues in certain cases). Relatedly, `NamedTuple` would generally put more strain on the compiler, potentially having an impact on compile times.
- The `tags` `Dict` could use types other than the ones I proposed above. It seems reasonable for the keys to be `Symbol` and the values to be `String`, but we should test out different options to see if there are performance issues. I think it was a mistake in terms of usability to use a fixed-size string for the tags, since it is very awkward when they overflow and it isn't easy to give users control over the maximum length; storing them as either `String` or `Symbol` would be a reasonable choice. Choosing between `String` and `Symbol` is subtle: `Symbol`s are basically interned strings, so they require less storage and have faster hashing/comparison, making them natural choices for the keys and possibly good choices for the values, while `String`s are easier to manipulate/iterate over, so they would be nicer for users.
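For reference, a sketch of the `NamedTuple` alternative (hypothetical `NTIndex` type), illustrating how the tag names and types leak into the index type:

```julia
# Sketch of NamedTuple-backed tags: heterogeneous tag types come
# naturally, but the tags become part of the index's type.
struct NTIndex{Space,Tags<:NamedTuple}
  space::Space
  id::UInt64
  plev::Int
  tags::Tags
end

i1 = NTIndex(2, rand(UInt64), 0, (site="S=1/2", n=1))
i2 = NTIndex(2, rand(UInt64), 0, (site="S=1/2",))

# Different tag sets give different index types, which is the
# type-stability concern described above:
typeof(i1) == typeof(i2)  # false
```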
A look ahead to redesigning around named ranges and `NamedDimArrays`
In the future, the `Index` type will likely become a named unit range over the values that the tensor index/mode runs over:
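Roughly speaking (a hypothetical sketch; the actual design in NDTensors.jl may differ), that might look like:

```julia
# Sketch of a named unit range: a plain range paired with a name.
struct NamedUnitRange{R<:AbstractUnitRange,Name}
  range::R
  name::Name
end

# An `Index` could then be a named unit range whose name carries the
# id, prime level, and tags:
i = NamedUnitRange(1:2, (; id=rand(UInt64), plev=0, tags=Dict(:site => "S=1/2")))
```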
Relatedly, the `ITensor` type will likely become a subtype of `AbstractNamedDimArray`, based on the code being developed here: https://github.com/ITensor/ITensors.jl/tree/v0.6.16/NDTensors/src/lib/NamedDimsArrays.

That's orthogonal to the proposal in this issue around redesigning tags, but it is a teaser for how the `Index` type will fit into the redesign of NDTensors.jl when we move away from the `Tensor`/`TensorStorage` types. It also gives some indication of how users who don't like the choices we made about the `Index`/`ITensor` types, for example the `Index` metadata based around random ID numbers, could define their own array types with named dimensions (i.e. design their own `ITensor` type that still shares a lot of the same code, like smart addition and contraction, but makes some other interface choices).

@emstoudenmire I'm curious to hear your thoughts on this.
@JoeyT1994 @ryanlevy this may be relevant for ITensorNumericalAnalysis.jl, since you might want to encode information about the function dimension/bits in the `Index`, and this would add more flexibility for doing that.
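For example (hypothetical tag names, using the Dict-based constructor sketched above):

```julia
# Tag each index with the function dimension and bit it encodes.
i = Index(2; dimension="x", bit="3")
```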