Batched run for QML applications (TensorFlow)

jackaraz commented 2 years ago

Feature details

Hi, I'm trying to implement a batched run for QML applications, following our chat with @josh146 here. I was testing my circuit with @qml.batch_params to see how it works. By default it takes inputs (non-trainable) of shape (n_batch, n_qubit) and weights (trainable) of shape (n_weights, n_batch), and when I set @qml.batch_params(all_operations=False) it only accepts batched trainables. This is a bit counter-intuitive to me, because in a classical ML application I would batch the inputs (non-trainable), run them through my network with the same weights, and then take the derivative of the mean or sum loss. Batching the inputs would allow much faster training and might be easier to parallelize. Additionally, neither pennylane/qnn/keras.py nor torch.py allows a batched run; they both unbatch the input and then execute the circuit. (There might be a reason for this that I'm not aware of; please let me know if this is intentional.)
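To make the request concrete, here is a minimal plain-Python sketch of the classical-ML pattern described above: a batch of inputs, one shared weight, and the gradient of the mean loss. It does not use PennyLane. The names `expval`, `batched_expvals`, `mean_loss` and `mean_loss_grad` are hypothetical, and `expval` is only a stand-in that returns cos(x + w), the ⟨Z⟩ expectation of a single qubit after RY(x) followed by RY(w). The derivative uses the two-term parameter-shift rule.

```python
import math

def expval(x, w):
    # Stand-in for a one-qubit circuit: RY(x), then RY(w), measured in Z.
    # For this circuit <Z> = cos(x + w).
    return math.cos(x + w)

def batched_expvals(xs, w):
    # Every sample in the batch is evaluated with the same shared weight.
    return [expval(x, w) for x in xs]

def mean_loss(xs, ys, w):
    # Mean squared error over the batch.
    preds = batched_expvals(xs, w)
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

def mean_loss_grad(xs, ys, w):
    # Chain rule: d(loss)/dw = mean(2 * (pred - y) * d(pred)/dw),
    # where d(pred)/dw comes from the parameter-shift rule.
    shift = math.pi / 2
    total = 0.0
    for x, y in zip(xs, ys):
        d_pred = (expval(x, w + shift) - expval(x, w - shift)) / 2
        total += 2 * (expval(x, w) - y) * d_pred
    return total / len(xs)

xs = [0.1, 0.4, 0.7]   # batched, non-trainable inputs
ys = [1.0, 0.5, 0.0]   # targets (made up for illustration)
w = 0.3                # single shared, trainable weight
print(mean_loss(xs, ys, w), mean_loss_grad(xs, ys, w))
```

The weight carries no batch dimension here; only the inputs do, which is the behaviour this issue asks for.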

Implementation

Below is a simple implementation of the batched input option. It works for me, but I don't think it's general enough: changing the order of the trainable and non-trainable inputs will probably cause problems.

import pennylane as qml
from pennylane import numpy as np

@qml.batch_transform
def batch_input(tape: qml.tape.JacobianTape):
    """
    At the time of this project, PennyLane does not have an implementation that batches only
    the non-trainable inputs of a circuit, which is the default approach in classical ML.
    This function batches only the non-trainable inputs and submits them to the device.

    Note that this function is not generic at the moment: all the non-trainable inputs need to
    come before the trainable inputs.
    TODO: find a way to generalize

    *Example*

        .. code-block:: python

            dev = qml.device("default.qubit", wires = 2, shots=None)
            @batch_input
            @qml.qnode(dev, diff_method="parameter-shift", interface="tf")
            def circuit(inputs, weights):
                qml.AngleEmbedding(inputs, wires = range(2), rotation="Y")
                qml.RY(weights[0], wires=0)
                qml.RY(weights[1], wires=1)
                return qml.expval(qml.PauliZ(1))

            >>> x = np.random.uniform(0,1,(10,2))
            >>> x.requires_grad = False
            >>> w = np.random.uniform(0,1,2)
            >>> circuit(x, w)
            <tf.Tensor: shape=(10,), dtype=float64, numpy=
            array([0.17926078, 0.7480163 , 0.47816999, 0.50381628, 0.349178  ,
                   0.17511444, 0.03769436, 0.19180259, 0.75867188, 0.55335748])>

    Parameters
    ----------
    tape : qml.tape.JacobianTape

    Returns
    -------
    Sequence[Sequence[qml.tape.JacobianTape], Callable]
        list of tapes arranged according to unbatched inputs and a callable function
        to batch the results.

    """
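    parameters = tape.get_parameters(trainable_only=False)

    # All non-trainable (batched) parameters must share the same batch dimension
    assert len(np.unique([qml.math.shape(x)[0] for x in parameters if not x.requires_grad])) == 1

    # Build one parameter list per sample: unstacked inputs first, then the shared weights
    output = []
    for inputs in zip(*[x for x in parameters if not x.requires_grad]):
        output += [list(inputs) + [weights for weights in parameters if weights.requires_grad]]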

    # Construct new output tape with unstacked inputs
    output_tapes = []
    for params in output:
        new_tape = tape.copy(copy_operations=True)
        new_tape.set_parameters(params, trainable_only=False)
        output_tapes.append(new_tape)

    return output_tapes, lambda x: qml.math.squeeze(qml.math.stack(x))
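The split-and-stack logic of the transform above can be sketched without PennyLane. This is an illustration under stated assumptions, not the library's implementation: `split_batch` and `expval_z1` are hypothetical names, an explicit `is_batched` flag list replaces the `requires_grad` check, and `expval_z1` uses the closed form for the docstring circuit (wire 1 sees RY(inputs[1]) followed by RY(weights[1]), so ⟨Z₁⟩ = cos(inputs[1] + weights[1])).

```python
import math

def split_batch(parameters, is_batched):
    """Build one parameter list per sample: batched entries are unstacked,
    shared entries are reused unchanged for every sample."""
    batched = [p for p, flag in zip(parameters, is_batched) if flag]
    shared = [p for p, flag in zip(parameters, is_batched) if not flag]
    # All batched parameters must agree on the batch size.
    if len({len(p) for p in batched}) != 1:
        raise ValueError("batched parameters must share one batch dimension")
    # Like the transform above, batched parameters are placed before shared ones.
    return [list(sample) + shared for sample in zip(*batched)]

def expval_z1(inputs, w0, w1):
    # Closed form for the docstring circuit; wire 0 (and hence w0) does not
    # affect the expectation value on wire 1.
    return math.cos(inputs[1] + w1)

x = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # batch of 3 samples, 2 features each
w0, w1 = 0.7, 0.8                          # shared weights
per_sample = split_batch([x, w0, w1], [True, False, False])
results = [expval_z1(*params) for params in per_sample]  # one value per sample
print(results)
```

Because the batched entries are always moved to the front, the sketch has the same ordering limitation noted in the docstring: a circuit that uses a trainable parameter before a non-trainable one would receive its parameters in the wrong positions.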

However, this implementation won't work in a Keras layer, simply because tf.Tensor objects do not have a requires_grad attribute, so I added an additional function just for TensorFlow:

import tensorflow as tf

@qml.batch_transform
def batch_input_tf(tape: qml.tape.JacobianTape):
    parameters = tape.get_parameters(trainable_only=False)

    # Assumes the first parameter is the batched input; the rest are shared weights
    unstacked_inpt = tf.unstack(parameters[0])
    output = [[x] + parameters[1:] for x in unstacked_inpt]

    # Construct new output tape with unstacked inputs
    output_tapes = []
    for params in output:
        new_tape = tape.copy(copy_operations=True)
        new_tape.set_parameters(params, trainable_only=False)
        output_tapes.append(new_tape)

    return output_tapes, lambda x: qml.math.squeeze(qml.math.stack(x))

Within class KerasLayer(Layer), the QNode can be wrapped via self.batched_qnode = batch_input_tf(qnode), and the call function can then be modified as follows:

    def call(self, inputs):
        if len(tf.shape(inputs)) == 1:
            inputs = tf.expand_dims(inputs, 0)

        kwargs = {**{QuantumLayer._input_arg: inputs},
                  **{k: 1.0 * w for k, w in self.qnode_weights.items()}}

        return self.batched_qnode(**kwargs)
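The control flow of this call method (promote a single sample to a batch of one, merge inputs and weights into keyword arguments, delegate to the batched function) can be sketched in plain Python. Everything here is hypothetical and uses nested lists instead of tensors: `ensure_batched`, `call`, and `toy_batched_fn` are illustrative stand-ins, not part of PennyLane or Keras.

```python
def ensure_batched(inputs):
    # A flat list of numbers is a single sample; wrap it so that the
    # batch axis always exists.
    if inputs and not isinstance(inputs[0], (list, tuple)):
        return [inputs]
    return inputs

def call(batched_fn, inputs, weights, input_arg="inputs"):
    # Merge the batched inputs and the shared weights into keyword arguments.
    kwargs = {input_arg: ensure_batched(inputs), **weights}
    return batched_fn(**kwargs)

def toy_batched_fn(inputs, weights):
    # Stand-in for the batched QNode: returns one output per sample.
    return [sum(sample) * weights for sample in inputs]

single = call(toy_batched_fn, [1.0, 2.0], {"weights": 2.0})
batch = call(toy_batched_fn, [[1.0, 2.0], [3.0, 4.0]], {"weights": 2.0})
print(single, batch)
```

A single sample and a batch therefore take the same code path, and the result always has one entry per sample.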

How important would you say this feature is?

3: Very important! Blocking work.

Additional information

Please note that these implementations are not generic, for the reasons mentioned above. Also, please let me know if this implementation would mess up some internal calculations within PennyLane. I just started using it a couple of days ago, so I'm not very familiar with the overall structure of the module.

Thanks, Jack

CatalinaAlbornoz commented 2 years ago

Hi @jackaraz, thank you for creating this issue! It's not obvious whether or not this will mess up any internal calculations. The best way to know would be to create a Pull Request. This way when the CI checks are run we can see if something is wrong.

Please let me know if you have any trouble creating the PR.

jackaraz commented 2 years ago

This issue has been resolved in PR #2069.

PennyLaneAI / pennylane