CQCL / lambeq

A high-level Python library for Quantum Natural Language Processing
https://cqcl.github.io/lambeq-docs
Apache License 2.0

CircuitNotRunError: Circuit corresponding to ResultHandle('3ab8e99f-729a-47aa-b523-f7f16d58e163', 0, 8, 'null') has not been run by this backend instance. #145

Closed fcasillo closed 1 month ago

fcasillo commented 1 month ago

Hello everybody, I'm trying to explore the lambeq library for a multiclass classification task. Following the guidelines in the documentation, I've implemented the following code:

```
print(df)

    index                                    RequirementText _class_
0       0     system shall interface faculty central server        O
1       1  product expected integrate multiple database m...       O
2       2  dispute application shall interface merchant i...       O
3       3  system shall able operate within business offi...       O
4       4            website must fully operational msn tv2        O
..    ...                                                ...      ...
75    216                system shall used realtor training       US
76    217  product shall installed untrained realtor with...      US
77    218   user need read user manual able use application      US
78    219  product shall easy customer novice skill inter...      US
79    220  product shall easy use adjuster collision esti...      US

[80 rows x 3 columns]
```

```python
def create_multi_diagrams(df, reader):
    diagrams, targets = [], []
    for _, row in enumerate(df.to_numpy()):
        sentence, target = str(row[1]), str(row[2])
        try:
            diagrams.append(reader.sentence2diagram(sentence))
        except Exception as e:
            print(sentence)
            print(e)
            continue
        if target == "US":
            targets.append([1.0, 0.0, 0.0, 0.0])
        elif target == "SE":
            targets.append([0.0, 1.0, 0.0, 0.0])
        elif target == "O":
            targets.append([0.0, 0.0, 1.0, 0.0])
        elif target == "PE":
            targets.append([0.0, 0.0, 0.0, 1.0])
    return diagrams, targets


cups_diagrams, cups_targets = create_multi_diagrams(df, cups_reader)
```

```python
def create_circuits(diagrams, targets, ansatz):
    circuits = []
    new_targets = []
    i = 0
    for diagram in diagrams:
        try:
            circuits.append(ansatz(diagram))
        except Exception as e:
            print(e)
            i += 1
            continue
        new_targets.append(targets[i])
        i += 1
    return circuits, new_targets


ob_map = {AtomicType.SENTENCE: 1, AtomicType.NOUN: 1}

iqpansatz = IQPAnsatz(ob_map, n_layers=1, n_single_qubit_params=2)

cups_iqp_circuits, cups_iqp_targets = create_circuits(cups_diagrams, cups_targets, iqpansatz)
```

```python
def accuracy(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return accuracy_score(y_true, y_pred)


def precision(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return precision_score(y_true, y_pred, average='weighted', zero_division=1)


def recall(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return recall_score(y_true, y_pred, average='weighted', zero_division=1)


def f1score(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return f1_score(y_true, y_pred, average='weighted', zero_division=1)


eval_metrics = {"prec": precision, "rec": recall, "acc": accuracy, "f1": f1score}
```

```python
from pytket.extensions.qiskit import AerBackend
from lambeq import BinaryCrossEntropyLoss

backend = AerBackend()
backend_config = {
    'backend': backend,
    'compilation': backend.default_compilation_pass(2),
    'shots': 8192
}

model = TketModel.from_diagrams(cups_iqp_circuits, backend_config=backend_config)

trainer = QuantumTrainer(
    model,
    loss_function=BinaryCrossEntropyLoss(),
    epochs=20,
    optimizer=NelderMeadOptimizer,
    optim_hyperparams={'a': 0.05, 'c': 0.06, 'A': 0.001 * 20},
    evaluate_functions=eval_metrics,
    evaluate_on_train=True,
    verbose='text',
    seed=SEED
)

X_train_grid, X_val_grid, y_train_grid, y_val_grid = train_test_split(
    cups_iqp_circuits, cups_iqp_targets, test_size=1/10, random_state=SEED)
train_dataset = Dataset(X_train_grid, y_train_grid, batch_size=64)
val_dataset = Dataset(X_val_grid, y_val_grid, shuffle=False)

trainer.fit(train_dataset, val_dataset, log_interval=1)
```

Everything works fine until the trainer.fit(train_dataset, val_dataset, log_interval=1) call, where I get the following error:

```
---------------------------------------------------------------------------
CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

14 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('3ab8e99f-729a-47aa-b523-f7f16d58e163', 0, 8, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    353             backresults = qiskit_result_to_backendresult(res)
    354             for circ_index, backres in enumerate(backresults):
--> 355                 self._cache[ResultHandle(jobid, circ_index, qubit_n, ppc)][
    356                     "result"
    357                 ] = backres

KeyError: ResultHandle('3ab8e99f-729a-47aa-b523-f7f16d58e163', 1, 8, 'null')
```

Any suggestions on how to modify the code to make it work?

neiljdo commented 1 month ago

Hi @fcasillo,

I'm taking a look at this. First, can you try the following ob_map for your ansatz?

```python
ob_map = {AtomicType.SENTENCE: 2, AtomicType.NOUN: 1}
```

Let me know the results after you've done this, thank you very much!

fcasillo commented 1 month ago

Hi @neiljdo, thank you for your help!

First, I changed the target representation to the following (I don't know if this helps):

```python
def create_multi_diagrams(df, reader):
    diagrams, targets = [], []
    for _, row in enumerate(df.to_numpy()):
        sentence, target = str(row[1]), str(row[2])
        try:
            diagrams.append(reader.sentence2diagram(sentence))
        except Exception as e:
            print(sentence)
            print(e)
            continue
        if target == "US":
            targets.append([0.0, 0.0])
        elif target == "SE":
            targets.append([0.0, 1.0])
        elif target == "O":
            targets.append([1.0, 0.0])
        elif target == "PE":
            targets.append([1.0, 1.0])
    return diagrams, targets
```

Then, following your suggestion, on the trainer.fit() call I got the following error:

```
WARNING:qiskit_aer.backends.aerbackend:Simulation failed and returned the following error message: PARTIAL COMPLETED

CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

15 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('fd6a39cb-9ad2-4359-8cd1-4e37425774be', 0, 46, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/result_convert.py in qiskit_experimentresult_to_backendresult(result, ppcirc)
     91 ) -> BackendResult:
     92     if not result.success:
---> 93         raise RuntimeError(result.status)
     94 
     95     header = result.header

RuntimeError: ERROR: Insufficient memory to run circuit circuit-166 using the statevector simulator. Required memory: 1073741824M, max memory: 54232M
```

So, I have two questions at this point: 1) How do I choose the ob_map? 2) Is there a way to calculate the memory needed to run the circuits?

Maybe there are sentences in my dataset that are too hard to handle, and that I could discard before training with the simulator.

neiljdo commented 1 month ago

Since you have four classes, you need log2(4) = 2 qubits assigned to the sentence output type, which is s in your case (and in general).

The problem with the above config with the cups_reader is that all the cups also get assigned the s type. This is not a problem for short sentences - for longer ones, you would want to use a different reader which doesn't only use the s type. Have you tried using BobcatParser to create diagrams? You can also use diagram rewrites, especially the RemoveCupsRewriter in tandem with this parser.
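For reference, a minimal sketch of that combination could look something like this (untested; the sentence below is just an example row from your dataframe, and the ansatz hyperparameters are the ones you already use):

```python
from lambeq import AtomicType, BobcatParser, IQPAnsatz, RemoveCupsRewriter

parser = BobcatParser()             # syntax-based parser instead of cups_reader
remove_cups = RemoveCupsRewriter()  # removing cups keeps the resulting circuits smaller

diagram = remove_cups(parser.sentence2diagram(
    "system shall interface faculty central server"))

# 4 classes -> log2(4) = 2 qubits on the sentence output
ob_map = {AtomicType.SENTENCE: 2, AtomicType.NOUN: 1}
ansatz = IQPAnsatz(ob_map, n_layers=1, n_single_qubit_params=2)
circuit = ansatz(diagram)
```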

I'm not that familiar with the Aer simulator, but based on the error message I think it already computes the required memory for you - you just only get notified once you try to evaluate your circuits, e.g. during training.

To summarize, I would suggest using the BobcatParser together with the RemoveCupsRewriter to create the diagrams, and keeping AtomicType.SENTENCE: 2 in your ob_map.

Also, your previous label encoding was correct - you have to one-hot encode your target labels.

fcasillo commented 1 month ago

I have removed the instances with more than 15 words, returned to one-hot encoded target labels, tried the BobcatParser, and also used the RemoveCupsRewriter. I have reduced the dataset to just 20 instances to make the problem easier, but I still get the same error as in the beginning:

```
WARNING:qiskit_aer.backends.aerbackend:Simulation failed and returned the following error message: PARTIAL COMPLETED

CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

13 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('063604d7-6427-4e22-afe9-07503bf9246c', 0, 13, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    353             backresults = qiskit_result_to_backendresult(res)
    354             for circ_index, backres in enumerate(backresults):
--> 355                 self._cache[ResultHandle(jobid, circ_index, qubit_n, ppc)][
    356                     "result"
    357                 ] = backres

KeyError: ResultHandle('063604d7-6427-4e22-afe9-07503bf9246c', 1, 13, 'null')
```

Do you suggest changing the simulator at this point? Which one would you use?

neiljdo commented 1 month ago

Can you provide the versions of lambeq, pytket and pytket-qiskit you're using? And your Python version, too? Thanks! (I should have asked for these earlier)

fcasillo commented 1 month ago

Of course, here they are:

```
Python version: 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]
lambeq version: 0.4.1
pytket version: 1.30.0
pytket-qiskit version: 0.55.0
```

neiljdo commented 1 month ago

Thank you. In the meantime, can you perform your experiment with a NumpyModel to see if there are issues with your current experimental setup? I have a different version of pytket and related extensions - this might be why I am not getting the same error you have.

fcasillo commented 1 month ago

Thank you for your help! So, I tried with NumpyModel, and this is the output during training:


```
ValueError                                Traceback (most recent call last)
in <cell line: 30>()
     28 val_dataset = Dataset(X_val_grid, y_val_grid, shuffle=False)
     29 
---> 30 trainer.fit(train_dataset, val_dataset, log_interval = 1)

8 frames
/usr/local/lib/python3.10/dist-packages/lambeq/training/loss.py in _match_shapes(self, y1, y2)
     64                       y2: np.ndarray | jnp.ndarray) -> None:
     65         if y1.shape != y2.shape:
---> 66             raise ValueError('Provided arrays must be of equal shape. Got '
     67                              f'arrays of shape {y1.shape} and {y2.shape}.')
     68 

ValueError: Provided arrays must be of equal shape. Got arrays of shape (18, 2, 2) and (18, 4).
```


The first shape seems to be the dimensions of the circuit outputs (18 instances, each with a 2x2 output). I tried to set AtomicType.SENTENCE: 4, and the error changes to:


```
ValueError: Provided arrays must be of equal shape. Got arrays of shape (18, 2, 2, 2, 2) and (18, 4).
```


As a last test, I tried to set AtomicType.SENTENCE: 2 again, but encoded the targets in the following way:

```python
if target == "US":
    targets.append([[1.0, 0.0], [0.0, 0.0]])
elif target == "SE":
    targets.append([[0.0, 1.0], [0.0, 0.0]])
elif target == "O":
    targets.append([[0.0, 0.0], [1.0, 0.0]])
elif target == "PE":
    targets.append([[0.0, 0.0], [0.0, 1.0]])
```

And I got the following error:


```
TypeError                                 Traceback (most recent call last)
in <cell line: 30>()
     28 val_dataset = Dataset(X_val_grid, y_val_grid, shuffle=False)
     29 
---> 30 trainer.fit(train_dataset, val_dataset, log_interval = 1)

3 frames
in precision(y_hat, y)
      5 
      6 def precision(y_hat, y):
----> 7     y_true = torch.argmax(y, dim=1)
      8     y_pred = torch.argmax(y_hat, dim=1)
      9     return precision_score(y_true, y_pred, average='weighted', zero_division=1)

TypeError: argmax(): argument 'input' (position 1) must be Tensor, not numpy.ndarray
```


Am I cursed?

neiljdo commented 1 month ago

Hi @fcasillo

With the switch to the NumpyModel all the tensor outputs will be of type np.ndarray - you just need to update your metrics to use the equivalent numpy functions instead of torch.
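For example, an (untested) direct numpy translation of your metric functions would be something like:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Same metrics as before, but with np.argmax instead of torch.argmax,
# since NumpyModel returns np.ndarray outputs.
def accuracy(y_hat, y):
    return accuracy_score(np.argmax(y, axis=1), np.argmax(y_hat, axis=1))

def precision(y_hat, y):
    return precision_score(np.argmax(y, axis=1), np.argmax(y_hat, axis=1),
                           average='weighted', zero_division=1)

def recall(y_hat, y):
    return recall_score(np.argmax(y, axis=1), np.argmax(y_hat, axis=1),
                        average='weighted', zero_division=1)

def f1score(y_hat, y):
    return f1_score(np.argmax(y, axis=1), np.argmax(y_hat, axis=1),
                    average='weighted', zero_division=1)
```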

fcasillo commented 1 month ago

It worked! Thank you so much! I tried again with the Aer simulator, but the KeyError: ResultHandle('6925cc08-282a-46d1-9f7c-f95166382196', 1, 13, 'null') persists. Any idea how I can get a quantum simulator working? I was also planning to try the IonQ platform to run the experiments on real quantum hardware, but I would at least like to make a simulator work first. Maybe I should try directly with the one provided by IonQ?

neiljdo commented 1 month ago

I'm still investigating the issue with the Aer simulator - maybe downgrading your pytket and related extensions, e.g. pytket-qiskit, could be a temporary fix. Here are my version numbers:

pytket==1.22.0
pytket-qiskit==0.44.0
qiskit==0.44.1
qiskit-aer==0.12.2
qiskit-ibm-provider==0.7.0
qiskit-ibm-runtime==0.12.2
qiskit-ibmq-provider==0.20.2
qiskit-terra==0.25.1

tket also has a page listing the backends, including simulators, that it supports: https://tket.quantinuum.com/api-docs/extensions.html. I'm not sure if pytket supports IonQ, though.

fcasillo commented 1 month ago

I don't know if this can be an issue, but I'm testing everything on Colab. That said, I tested your versions of the libraries, and this is the result when training the model:


```
TypeError                                 Traceback (most recent call last)
in <cell line: 29>()
     27 val_dataset = Dataset(X_val_grid, y_val_grid, shuffle=False)
     28 
---> 29 trainer.fit(train_dataset, val_dataset, log_interval = 1)

8 frames
/usr/local/lib/python3.10/dist-packages/lambeq/backend/quantum.py in eval(self, backend, mixed, contractor, *others, **params)
    309         for i, circuit in enumerate(circuits):
    310             n_bits = len(circuit.post_processing.dom)
--> 311             result = np.zeros((n_bits * (2, )))
    312             for bitstring, count in counts[i].items():
    313                 result[bitstring] = count

TypeError: Cannot interpret '2' as a data type
```


Following the suggestions in another discussion, I tried to rewrite the diagrams as follows:

```python
from lambeq import UnifyCodomainRewriter
from lambeq import RemoveCupsRewriter

unify_codomain = UnifyCodomainRewriter()
remove_cups = RemoveCupsRewriter()

bob_diagrams = [unify_codomain(remove_cups(diagram)) for diagram in bobcat_diagrams]
```

and added discard=True to the IQPAnsatz call. The last test I did was to one-hot encode the targets as at the start of the discussion, but none of these changes worked; I always get the same error.

neiljdo commented 1 month ago

I'm working on fixing that bug (the TypeError bug) - the fix should be available in the next release of lambeq.

If you're keen on playing with the simulator backends, you could install this dirty patch I made to address the above issue:

```
pip install git+https://github.com/neiljdo/lambeq-public.git@hotfix#egg=lambeq
```

YMMV but rest assured, this issue should be fixed in a future release.

fcasillo commented 1 month ago

Good morning @neiljdo! This version works, even though I had to update the metrics to use numpy. Thank you for your big help! One last question that is unrelated to this issue: I noticed the simulator is somewhat slow when dealing with more instances, so is there a way to run it on GPU?

neiljdo commented 1 month ago

Hi @fcasillo

You can use NumpyModel with JIT (following the instructions here). There's a note there, though, that it's not advisable to use JIT for large models, e.g. high qubit count, due to larger memory requirements.
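In code, that's essentially just the following (here circuits stands for your list of ansatz circuits):

```python
from lambeq import NumpyModel

# use_jit=True makes lambeq JIT-compile the circuit evaluation with JAX;
# with a GPU-enabled JAX installation, the compiled evaluation can run on the GPU.
model = NumpyModel.from_diagrams(circuits, use_jit=True)
```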

For the other backends, you have to go to their specific docs to see if they enable GPU acceleration. For example, Pennylane has this lightning.gpu device.

fcasillo commented 1 month ago

Thanks again for your assistance and suggestions! Best of luck, see you next time!