zapatacomputing / benchq

Resource estimation for fault-tolerant quantum computation.
Apache License 2.0
53 stars 12 forks source link

possible speedup: It might be worthwhile moving away from the Set{int} datastructure since there with lengths and collects #54

Open shobhan126 opened 1 year ago

shobhan126 commented 1 year ago

The set data structure could perhaps be changed to a SparseMatrix representation for AdjacencyMatrix?

Also UniqueVectors.jl could be potentially useful here.

https://github.com/zapatacomputing/benchq/blob/d4872296dc3097739b6a05aa4120cb9984ac3efb/src/benchq/compilation/graph_sim_mini.jl#L38

the collect call is going to make a new copy of this set; do we have an estimate on how big these sets become? https://github.com/zapatacomputing/benchq/blob/d4872296dc3097739b6a05aa4120cb9984ac3efb/src/benchq/compilation/graph_sim_mini.jl#L146-L150

Is there a reason for the collect call before iterating over this like so? I am guessing the logic here is that adjacency matrix is supposed to be symmetric? A' = A ? would this be simplified by just using the Iterators.product(neighbors, neighbors) and modifying the toggle_edge(adj, v1, v2) to only delete a single edge from the adjacency list at a time? validation for symmetry can be ensure in the generation step so we don't have to wrangle ourselves by collecting the set and trying to setup a nested for loop

https://github.com/zapatacomputing/benchq/blob/d4872296dc3097739b6a05aa4120cb9984ac3efb/src/benchq/compilation/graph_sim_mini.jl#L168-L176

also toggle_edge!(...) since it mutates the adjacency list.

we should also get rid of this deepcopy and pop call? I am not fully sure of the logic. Since this is a set, which is unordered, how are we ensuring that the pop! is getting out the correct item? is there an implied / expected size of adj[v]?

https://github.com/zapatacomputing/benchq/blob/d4872296dc3097739b6a05aa4120cb9984ac3efb/src/benchq/compilation/graph_sim_mini.jl#L128-L130

Has the interface SimpleGraph Graphs.jl ever been explored? https://juliagraphs.org/Graphs.jl/dev/ecosystem/interface/

Maybe we can reduce the indexing by 1 / indexing by 0 headache with OffsetArrays.jl see

https://github.com/zapatacomputing/benchq/blob/d4872296dc3097739b6a05aa4120cb9984ac3efb/src/benchq/compilation/graph_sim_mini.jl#L249 Instead of creating a pylist(pylist([...] .-1)) we can just use an offset array functionality so we don't have to worry about this

AthenaCaesura commented 1 year ago

Some of the above the issues raised here were addressed in the above PR. But to answer your questions: I'm not sure which data structure is best for this, we are actively looking into different data structures, but it seems hard to find on that can easily handle all the toggles we are doing. The sets usually remain very small relative to the size of the graph. Usualy less than a thousandth of the size. We have gotten rid of the collect call. Using the symmetry of the matrix to reduce the cost of toggling probably won't be possible. The reasons for this are a little complex, but if you want to chat about it, I'd be happy to. SimpleGraph probably won't be great, because we want to use a sparse adjacency list to represent the graph. We might be able to write our own, but that's a lot of extra work. I don't think using pylist really affects performance right now, so I don't think I'll worry about it too much.