@tanujkhattar is out for a while and would be the best person to investigate this
Each Bloq has class attributes. StatePreparationAliasSampling has an attribute:
selection_registers: Tuple[SelectionRegister, ...] = attrs.field(
converter=lambda v: (v,) if isinstance(v, SelectionRegister) else tuple(v)
)
which isn't actually documented (@tanujkhattar please address when you're back), but it presumably lets you re-configure the Registers used for selection. Crucially, these registers contain the iteration_length that gives the range of indices that will appear in the selection register.
It doesn't look like this feature is used within the Qualtran codebase: usually the from_lcu_probs factory method is used instead, which uses one register named "selection".
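For reference, that factory path looks something like this (a sketch; argument names other than lcu_probabilities are from memory and may differ):

from qualtran.bloqs.state_preparation import StatePreparationAliasSampling

coeffs = [0.25, 0.25, 0.25, 0.25]
# from_lcu_probs builds the alt/keep tables and a single register named 'selection' with
# bitsize=(N - 1).bit_length() and iteration_length=N, where N = len(coeffs).
state_prep = StatePreparationAliasSampling.from_lcu_probs(coeffs, probability_epsilon=0.05)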
Meanwhile: the serialization code only handles certain types when used as class attributes: https://github.com/quantumlib/Qualtran/blob/main/qualtran/protos/args.proto#L37
We'd need to add support for class attributes that are Register (SelectionRegister) types.
How urgent is this? In the meantime, you could patch it roughly like this:
diff --git a/qualtran/bloqs/state_preparation.py b/qualtran/bloqs/state_preparation.py
index 718095af..016fa6b0 100644
--- a/qualtran/bloqs/state_preparation.py
+++ b/qualtran/bloqs/state_preparation.py
@@ -84,9 +84,8 @@ class StatePreparationAliasSampling(PrepareOracle):
(https://arxiv.org/abs/1805.03662).
Babbush et. al. (2018). Section III.D. and Figure 11.
"""
- selection_registers: Tuple[SelectionRegister, ...] = attrs.field(
- converter=lambda v: (v,) if isinstance(v, SelectionRegister) else tuple(v)
- )
+ sel_bitsize: int
+ sel_range: int
alt: NDArray[np.int_]
keep: NDArray[np.int_]
mu: int
@@ -109,12 +108,21 @@ class StatePreparationAliasSampling(PrepareOracle):
)
N = len(lcu_probabilities)
return StatePreparationAliasSampling(
- selection_registers=SelectionRegister('selection', (N - 1).bit_length(), N),
+ sel_bitsize=(N - 1).bit_length(),
+ sel_range=N,
alt=np.array(alt),
keep=np.array(keep),
mu=mu,
)
+ @property
+ def selection_registers(self) -> Tuple[SelectionRegister, ...]:
+ return (
+ SelectionRegister(
+ 'selection', bitsize=self.sel_bitsize, iteration_length=self.sel_range
+ ),
+ )
+
@cached_property
def sigma_mu_bitsize(self) -> int:
return self.mu
@@ -158,7 +166,7 @@ class StatePreparationAliasSampling(PrepareOracle):
) -> cirq.OP_TREE:
selection, less_than_equal = quregs['selection'], quregs['less_than_equal']
sigma_mu, alt, keep = quregs.get('sigma_mu', ()), quregs['alt'], quregs.get('keep', ())
- N = self.selection_registers[0].iteration_length
+ N = self.sel_range
yield PrepareUniformSuperposition(N).on(*selection)
yield cirq.H.on_each(*sigma_mu)
qrom_gate = QROM(
Thank you! I'll test these patches and see if I run into any other issues!
Unfortunately, with the changes you suggested, I got the following:
File ~/.../Qualtran/qualtran/serialization/args.py:49, in arg_to_proto(name, val)
47 if isinstance(val, cirq.Gate):
48 return args_pb2.BloqArg(name=name, cirq_json_gzip=cirq.to_json_gzip(val))
---> 49 raise ValueError(f"Cannot serialize {val} of unknown type {type(val)}")
ValueError: Cannot serialize () of unknown type <class 'tuple'>
So it turns out that arg_to_proto doesn't handle two types of data which appear in this Bloq:
()
[array([2, 2, 3, 3]), array([5, 4, 7, 0])]
So I added this extremely unsafe and hacky logic:
if isinstance(val, tuple):
return args_pb2.BloqArg(name=name, ndarray=_ndarray_to_proto(np.ndarray(val)))
if isinstance(val, list) and len(val) != 0 and isinstance(val[0], np.ndarray):
return args_pb2.BloqArg(name=name, ndarray=_ndarray_to_proto(np.stack(val)))
The first one is wrong because np.ndarray(val) treats its argument as a shape, so np.ndarray(()) gives back an uninitialized 0-d array (which happened to contain a single 0) rather than an empty array – I guess some version of a protobuf null would be more appropriate here.
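A slightly safer variant of the first branch (just a sketch, untested) would use np.asarray, which keeps the tuple's values instead of interpreting them as a shape:

# np.asarray(()) gives an empty 1-d array, so nothing bogus gets serialized and the later
# tuple(...) attrs converters (e.g. cvs in PrepareUniformSuperposition) can still iterate it.
if isinstance(val, tuple):
    return args_pb2.BloqArg(name=name, ndarray=_ndarray_to_proto(np.asarray(val)))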
The second one might be giving correct results; this is what I get when I print out the protobuf object:
bloq {
name: "QROM"
args {
name: "data"
ndarray {
shape: 2
shape: 4
dtype: "np.dtype(\'int64\')"
data: "\002\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\003\000\000\000\000\000\000\000\003\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\007\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
}
}
Also, when I tried to load it back with bloqs_from_proto(my_proto_obj) I got:
ValueError: Unable to find a Bloq corresponding to bloq_proto.bloq.name='StatePreparationAliasSampling'
So I added the following entries to the RESOLVER_DICT in serialization/bloq.py:
'StatePreparationAliasSampling': StatePreparationAliasSampling,
'PrepareUniformSuperposition': PrepareUniformSuperposition,
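In context, that amounts to something like this (a sketch; the import paths are taken from the file names appearing in this thread and may have moved since):

from qualtran.bloqs.prepare_uniform_superposition import PrepareUniformSuperposition
from qualtran.bloqs.state_preparation import StatePreparationAliasSampling

RESOLVER_DICT.update(  # RESOLVER_DICT is the existing mapping in qualtran/serialization/bloq.py
    {
        'StatePreparationAliasSampling': StatePreparationAliasSampling,
        'PrepareUniformSuperposition': PrepareUniformSuperposition,
    }
)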
and then this hit me:
File "<attrs generated init qualtran.bloqs.prepare_uniform_superposition.PrepareUniformSuperposition>", line 4, in __init__
_setattr('cvs', __attr_converter_cvs(cvs))
File "/.../Qualtran/qualtran/bloqs/prepare_uniform_superposition.py", line 52, in <lambda>
converter=lambda v: (v,) if isinstance(v, int) else tuple(v), default=()
TypeError: iteration over a 0-d array
So it looks like the fact that cvs was not serialized properly is coming back to bite me. If you could help me out, that would be great, as it seems that doing this serde properly would require some protobuf learning.
I hope it's helpful :)
Extra comment – the logic with RESOLVER_DICT in bloq_id_to_bloq seems a bit suspicious to me.
First, it will fail for any bloqs which are not on the list.
Second, I thought it would handle Alias Sampling just fine because it's a composite bloq. At least I assumed it was one, since I can call decompose_bloq on it – but when I checked, it actually isn't, so I was wrong and the logic does kind of make sense. I'm just letting you know that how these things are structured is a bit confusing for an outside user.
So I wanted to ask – is the first point by design? I can imagine you might want to restrict the set of basic bloqs available for deserialization, but on the other hand I can also imagine this being a temporary solution that will be replaced by something more robust (e.g. an autogenerated or user-provided RESOLVER_DICT)?
We synced offline, but capturing here.
CompositeBloq.
Re: serialization: @tanujkhattar would be the best person to fix this properly, but he is out for a bit. The original design of the serialization was supposed to be restrictive about what types of values you could include in bloq attributes, but we never actually tested that, so there are now bloqs that include additional types of values in their attributes.
I like your hack, but instead of hacking things into ndarray, you could try json.dumps-ing the attributes to the string_val field in the BloqArg proto message. You'd have to deserialize it too. The following is my idea, but untested:
--- a/qualtran/serialization/args.py
+++ b/qualtran/serialization/args.py
@@ -11,6 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+import json
from typing import Any, Dict, Union
import cirq
@@ -38,15 +39,13 @@ def arg_to_proto(*, name: str, val: Any) -> args_pb2.BloqArg:
return args_pb2.BloqArg(name=name, int_val=val)
if isinstance(val, float):
return args_pb2.BloqArg(name=name, float_val=val)
- if isinstance(val, str):
- return args_pb2.BloqArg(name=name, string_val=val)
if isinstance(val, sympy.Expr):
return args_pb2.BloqArg(name=name, sympy_expr=str(val))
if isinstance(val, np.ndarray):
return args_pb2.BloqArg(name=name, ndarray=_ndarray_to_proto(val))
if isinstance(val, cirq.Gate):
return args_pb2.BloqArg(name=name, cirq_json_gzip=cirq.to_json_gzip(val))
- raise ValueError(f"Cannot serialize {val} of unknown type {type(val)}")
+ return args_pb2.BloqArg(name=name, string_val=json.dumps(val))
def arg_from_proto(arg: args_pb2.BloqArg) -> Dict[str, Any]:
@@ -55,7 +54,7 @@ def arg_from_proto(arg: args_pb2.BloqArg) -> Dict[str, Any]:
if arg.HasField("float_val"):
return {arg.name: arg.float_val}
if arg.HasField("string_val"):
- return {arg.name: arg.string_val}
+ return {arg.name: json.loads(arg.string_val)}
if arg.HasField("sympy_expr"):
return {arg.name: parse_expr(arg.sympy_expr)}
if arg.HasField("ndarray"):
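If that works, the round trip for a plain Python value would look something like this (untested; the 'cvs' name and values are made up):

from qualtran.serialization.args import arg_from_proto, arg_to_proto

proto = arg_to_proto(name='cvs', val=[1, 0, 1])      # no earlier branch matches, so it falls through to json.dumps
assert arg_from_proto(proto) == {'cvs': [1, 0, 1]}   # string_val gets json.loads-ed back into a list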
This approach immediately throws: *** TypeError: Object of type ndarray is not JSON serializable.
Ok, so I made the following changes to make it work:
I added a custom NumPy JSON encoder (source: https://pynative.com/python-serialize-numpy-ndarray-into-json/) to deal with np arrays, so the line of interest looks like this:
return args_pb2.BloqArg(name=name, string_val=json.dumps(val, cls=NumpyArrayEncoder))
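For reference, the encoder is roughly the standard recipe from that link (a sketch, not necessarily the exact code I used):

import json
import numpy as np

class NumpyArrayEncoder(json.JSONEncoder):
    # json.JSONEncoder that turns numpy scalars/arrays into plain Python values.
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)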
There were still some issues, so in the end, instead of passing coeffs (the input to StatePreparationAliasSampling.from_lcu_probs) as an np.array, I just passed them as a list.
In QROM, the __attrs_post_init__ complained a bit, so I fixed it by casting d to a numpy array:
shapes = [np.array(d).shape for d in self.data]
And I removed the last two assertions checking that self.selection_bitsizes and self.target_bitsizes are tuples, as after deserialization they ended up being lists.
With all those changes when I do:
proto_stuff = bloqs_to_proto(state_prep)
reconstructed = bloqs_from_proto(proto_stuff)
I get the following:
(Pdb) reconstructed
[StatePreparationAliasSampling(sel_bitsize=2, sel_range=4, alt=array([2, 2, 3, 3]), keep=array([5, 4, 7, 0]), mu=3), PrepareUniformSuperposition(n=4, cvs=()), Split(n=3), CirqGateAsBloq(gate=cirq.H), QROM(data=[[2, 2, 3, 3], [5, 4, 7, 0]], selection_bitsizes=[2], target_bitsizes=[2, 3], num_controls=0), Join(n=3), LessThanEqual(x_bitsize=3, y_bitsize=3), CSwap(bitsize=2)]
(Pdb) state_prep
StatePreparationAliasSampling(sel_bitsize=2, sel_range=4, alt=array([2, 2, 3, 3]), keep=array([5, 4, 7, 0]), mu=3)
Which is a bit surprising, as I was expecting only one output, but I guess since state_prep == reconstructed[0] yields True, this is fine.
Just FYI as another minor unintuitive thing :)
So I can do my stuff and you have a list of minor issues to fix, so I think we can call it a success 🎉 !
Props for powering through.
bloqs_to_proto will construct a BloqLibrary proto message which has multiple bloqs in it. This needs to be documented (https://github.com/quantumlib/Qualtran/issues/333), but when you ask to serialize StatePreparationAliasSampling it will serialize that bloq (i.e. its attributes/signature) and its decomposition (i.e. a DAG of subbloqs). The subbloqs (i.e. their attributes/signatures) will also be serialized in the BloqLibrary as the nodes (or node data, depending on how you think about it) in the decomposition DAG. There's a sneaky argument max_depth that controls how deep we go.
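Roughly, and untested here (state_prep is the bloq from your snippet above; I'm assuming the functions are importable from serialization/bloq.py):

from qualtran.serialization.bloq import bloqs_from_proto, bloqs_to_proto

# Serializing one bloq yields a BloqLibrary with that bloq plus the subbloqs from its
# decomposition DAG; max_depth limits how many levels of decomposition are followed.
library = bloqs_to_proto(state_prep, max_depth=1)
roundtripped = bloqs_from_proto(library)
assert roundtripped[0] == state_prep  # the original bloq comes back as the first entry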
Admittedly, I didn't bother to read the documentation of bloqs_to_proto 😅
But it makes sense 👌
That's probably for the best as there currently isn't any 💀
Sorry @mstechly, the initial prototype for serialization was added a while ago and it hasn't kept up with all the new things we've added to Qualtran. I appreciate your patience for powering through the rough edges!
Right now, SelectionRegister serialization is not supported because SelectionRegister is supposed to be deprecated and removed soon after we implement the more general Quantum Data Types in Qualtran proposal.
Is full serialization support a priority for you to unblock ongoing work, or was it a one-off experiment? If it's not a priority, I'll hold off on adding serialization support for SelectionRegister (which implies all Unary Iteration-derived bloqs would run into an error).
Not a priority, thanks @tanujkhattar !
Has this been fixed?
Yes, this is now fixed. Serialization of symbolic alias sampling is blocked on serializing the Shaped object, but that's independent of the bloq serialization overall. Non-symbolic alias sampling bloqs serialize fine now and we have tests that verify this. I think this can be closed. We can track getting rid of the long list of "not yet serializable" bloqs in a separate issue.
I was trying to serialize Alias Sampling. However, I got the following error:
ValueError: Cannot serialize (SelectionRegister(name='selection', bitsize=2, iteration_length=4, shape=(), side=<Side.THRU: 3>),) of unknown type <class 'tuple'>