quantumlib / Cirq

A Python framework for creating, editing, and invoking Noisy Intermediate Scale Quantum (NISQ) circuits.
Apache License 2.0

Fix #6706 – Update sources for compatibility with NumPy 2 #6724

Closed mhucka closed 1 month ago

mhucka commented 2 months ago

This is a set of changes necessary to address issue #6706 and support Cirq migration to NumPy 2. The result makes Cirq compatible with NumPy 2 and 1, with the exception of the cirq-rigetti module, which at this time has an incompatible requirement for NumPy 1.

The changes target NumPy 2.0.2 rather than NumPy 2.1. At this time, requiring NumPy 2.1 causes dependency conflicts with other packages used by Cirq, which currently limits us to 2.0.2. (Note for the Google Quantum team: Google's internal codebase is about to move to NumPy 2.0.2, not 2.1, so the inability to support NumPy 2.1 is not a problem with respect to this impending change.)

The rest of this text summarizes the changes in this PR.

Avoid a construct deprecated in NumPy 2

The NumPy 2 Migration Guide explicitly recommends changing constructs of the form np.array(state, copy=False) to np.asarray(state).
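
As a minimal sketch of the difference (the variable names here are illustrative, not taken from Cirq): in NumPy 2, np.array(x, copy=False) raises a ValueError when a copy cannot be avoided, whereas np.asarray(x) copies only when it has to.

```python
import numpy as np

state = np.array([1.0, 0.0, 0.0, 0.0])

# For an ndarray that already has the requested dtype, np.asarray
# returns the very same object, so no copy is made.
same = np.asarray(state)
print(same is state)  # True

# When a conversion is unavoidable (a list input, or a dtype change),
# np.asarray makes the copy instead of raising an error.
from_list = np.asarray([1.0, 0.0])
as_complex = np.asarray(state, dtype=np.complex128)
print(np.shares_memory(as_complex, state))  # False
```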

Avoid implicitly converting 2-D arrays of single value to scalars

NumPy 2 raises deprecation warnings when an ndarray with dimension > 0 that holds a single value, like [[0]], is converted to a scalar value like 0. The solution is to retrieve the value using .item() instead.
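
A small illustration of the pattern, using a made-up array rather than actual Cirq code:

```python
import numpy as np

m = np.array([[0]])  # a 2-D array holding a single value

# .item() extracts the lone element as a plain Python scalar,
# without relying on the implicit array-to-scalar conversion.
value = m.item()
print(value, type(value))  # 0 <class 'int'>
```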

Address change in NumPy string representation of scalars

As a consequence of NEP 51, the string representation of scalar numbers changed in NumPy 2 to include type information. This affected printing Cirq circuit diagrams: instead of seeing numbers like 1.5, you would see np.float64(1.5) and similar.

The solution is to use .item() on scalars before passing them to anything that needs to use the scalar's string representation (via str or __repr__()). So, for example, if x is an np.int64 object, x.item() returns the Python object, and then the string form of that looks normal.
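
The following sketch shows the idea on two standalone scalars; the label string is a hypothetical stand-in for diagram text, not Cirq's actual formatting code.

```python
import numpy as np

x = np.float64(1.5)
n = np.int64(3)

# repr(x) depends on the NumPy version (NEP 51 added the type name in
# NumPy 2), so text built from it changes between versions. Converting
# to the plain Python object first gives the same text on both.
print(type(x.item()), type(n.item()))  # <class 'float'> <class 'int'>

label = f"{x.item()!r}^{n.item()!r}"
print(label)  # 1.5^3
```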

Explicitly convert NumPy ndarray of np.bool to Python bool

In NumPy 2 (and possibly earlier versions), lines 478-480 in cirq-google/cirq_google/api/v2/program_pb2.pyi produced a deprecation warning:

DeprecationWarning: In future, it will be an error
for 'np.bool' scalars to be interpreted as an index

This warning is somewhat misleading: while it is the case that Booleans are involved, they are not being used as indices.

The fields rs, xs, and zs of CliffordTableau as defined in file cirq-core/cirq/qis/clifford_tableau.py have type Optional[np.ndarray], and the values in the ndarray have NumPy type bool in practice. The protocol buffer version of CliffordTableau defined in file cirq-google/cirq_google/api/v2/program_pb2.pyi defines those fields as collections.abc.Iterable[builtins.bool]. At first blush, you might think they're arrays of Booleans in both cases, but unfortunately, there's a wrinkle: Python defines its built-in bool type as being derived from int (see PEP 285), while NumPy explicitly does not derive its bool from its integer class (see https://numpy.org/doc/2.0/reference/arrays.scalars.html#numpy.bool). The warning about converting np.bool to index values (i.e., integers) probably arises when the np.bool values in the ndarray are coerced into Python Booleans.

At first, I thought the obvious solution would be to use np.asarray to convert the values to builtins.bool, but this did not work:

>>> import numpy as np
>>> import builtins
>>> arr = np.array([True, False], dtype=np.bool)
>>> arr
array([ True, False])
>>> type(arr[0])
<class 'numpy.bool'>
>>> newarr = np.asarray(arr, dtype=builtins.bool)
>>> newarr
array([ True, False])
>>> type(newarr[0])
<class 'numpy.bool'>

They still end up being NumPy bools. Some other variations on this approach all failed to produce proper Python Booleans. In the end, what worked was to use map() to apply builtins.bool to every value in the incoming arrays. This may not be as efficient as possible; a possible optimization for the future is to look for a more efficient way to cast the types, or avoid having to do it at all.
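
A minimal sketch of the map() approach on a standalone array (the name xs only mirrors the field name; this is not the code from the PR). It uses np.bool_ so that it also runs on NumPy 1. The ndarray.tolist() call at the end is shown only as one possible alternative that also yields Python Booleans; it is not what this PR does.

```python
import numpy as np

xs = np.array([True, False, True], dtype=np.bool_)

# Apply the built-in bool to every element so that each value is a
# genuine Python bool rather than a NumPy bool scalar.
py_xs = list(map(bool, xs))
print(py_xs)  # [True, False, True]
print(all(type(v) is bool for v in py_xs))  # True

# Possible alternative (an assumption, not the PR's approach):
# tolist() converts elements to the nearest Python type.
alt_xs = xs.tolist()
print(all(type(v) is bool for v in alt_xs))  # True
```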

Address changes in NumPy data type promotion

Note added 2024-09-20: Pavol reasoned convincingly that it would be better to pull the non-essential NumPy 2 type warnings to a separate PR at a later date, and focus this PR on only essential compatibility issues. Pavol amended the PR accordingly. Consequently, the changes described in this section are mostly not part of the final PR. This text is being left in place because it provides details that may be useful in the future PR.

One of the changes in NumPy 2 is to the behavior of type promotion. A possible negative impact of the changes is that some operations involving scalar types can lead to lower precision, or even overflow. For example, uint8(100) + 200 previously (in NumPy < 2.0) produced a uint16 value, but now results in a uint8 value and an overflow warning (not error). This can have an impact on Cirq. For example, in Cirq, simulator measurement result values are uint8s, and in some places, arrays of values are summed; this leads to overflows if the sum exceeds 255, the largest value a uint8 can hold. It would not be appropriate to change measurement values to be larger than uint8, so in cases like this, the proper solution is probably to make sure that where values are summed or otherwise numerically manipulated, uint16 or larger values are ensured.
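
The remedy of widening before arithmetic can be sketched as follows. The arrays are invented for illustration and the example uses array operands, whose uint8 wraparound is the same on NumPy 1 and 2; the scalar case mentioned above is the one whose result depends on the NumPy version.

```python
import numpy as np

a = np.array([200, 100], dtype=np.uint8)
b = np.array([100, 100], dtype=np.uint8)

# Element-wise uint8 arithmetic wraps around modulo 256.
wrapped = a + b
print(wrapped)  # [ 44 200]

# Widening one operand first keeps the true values.
widened = a.astype(np.uint16) + b
print(widened, widened.dtype)  # [300 200] uint16

# A reduction can request a wider accumulator explicitly.
total = a.sum(dtype=np.uint16)
print(total)  # 300
```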

NumPy 2 offers a new option (np._set_promotion_state("weak_and_warn")) to produce warnings where data types are changed. Commit https://github.com/quantumlib/Cirq/commit/6cf50eb382cbff86c599ccafe50886aef20afa51 adds a new command-line option to our pytest framework, such that running

check/pytest --warn-numpy-data-promotion

will turn on this NumPy setting. Running check/pytest with this option enabled revealed quite a lot of warnings. The present commit changes code in places where those warnings were raised, in an effort to eliminate as many of them as possible.
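
For readers unfamiliar with how such a flag is wired up, here is a rough sketch of what a conftest.py hook pair for it could look like. This is an assumption about the general shape, not the contents of the commit above; the help text is made up, and because np._set_promotion_state is a private function that is not present in every NumPy release, the sketch guards the call.

```python
import numpy as np


def pytest_addoption(parser):
    # Register the flag; it is off unless given on the command line.
    parser.addoption(
        "--warn-numpy-data-promotion",
        action="store_true",
        default=False,
        help="enable NumPy warnings about data type promotion",
    )


def pytest_configure(config):
    if config.getoption("--warn-numpy-data-promotion"):
        # Private NumPy API: look it up defensively, since it may be
        # missing depending on the installed NumPy version.
        set_state = getattr(np, "_set_promotion_state", None)
        if set_state is not None:
            set_state("weak_and_warn")
```

pytest discovers both hooks by name when they are defined in a conftest.py file, so no explicit registration is needed.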

It is certainly the case that not all of the type promotion warnings are meaningful. Unfortunately, I found it sometimes difficult to be sure of which ones are meaningful, in part because Cirq's code has many layers and uses ndarrays a lot, and understanding the impact of a type demotion (say, from float64 to float32) was difficult for me to do. In view of this, I wanted to err on the side of caution and try to avoid losses of precision. The principles followed in the changes are roughly the following:

It is likely that this approach resulted in some unnecessary up-promotion of values and may have impacted run-time performance. Some simple overall timing of check/pytest did not reveal a glaring negative impact of the changes, but that doesn't mean real applications won't be impacted. Perhaps a future review can evaluate whether speedups are possible.

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 96.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.83%. Comparing base (484df6f) to head (cab0fb2). Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| conftest.py | 80.00% | 1 Missing |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #6724      +/-   ##
==========================================
- Coverage   97.83%   97.83%   -0.01%
==========================================
  Files        1077     1077
  Lines       92524    92537      +13
==========================================
+ Hits        90523    90535      +12
- Misses       2001     2002       +1
```
