Closed fcasillo closed 1 month ago
Hi @fcasillo,
I'm taking a look at this. First, can you try the following ob_map for your ansatz?
ob_map = {AtomicType.SENTENCE: 2, AtomicType.NOUN: 1}
Let me know the results after you've done this, thank you very much!
Hi @neiljdo, thank you for your help!
First, I changed the target representation to the following; I don't know if this helps:
`
def create_multi_diagrams(df, reader):
    diagrams, targets = [], []
    for _, row in enumerate(df.to_numpy()):
        sentence, target = str(row[1]), str(row[2])
        try:
            diagrams.append(reader.sentence2diagram(sentence))
        except Exception as e:
            print(sentence)
            print(e)
            continue
        if target == "US":
            targets.append([0.0, 0.0])
        elif target == "SE":
            targets.append([0.0, 1.0])
        elif target == "O":
            targets.append([1.0, 0.0])
        elif target == "PE":
            targets.append([1.0, 1.0])
    return diagrams, targets
`
Then, following your suggestion, I got the following error when calling trainer.fit():
`
CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

15 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('fd6a39cb-9ad2-4359-8cd1-4e37425774be', 0, 46, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/result_convert.py in qiskit_experimentresult_to_backendresult(result, ppcirc)
     91 ) -> BackendResult:
     92     if not result.success:
---> 93         raise RuntimeError(result.status)
     94
     95     header = result.header

RuntimeError: ERROR: Insufficient memory to run circuit circuit-166 using the statevector simulator. Required memory: 1073741824M, max memory: 54232M
`
So I have two questions at this point: 1) How do I choose the ob_map? 2) Is there a way to calculate the memory needed to run the circuits?
Maybe there are sentences in my dataset that are hard to handle and that I could discard before training with the simulator.
Since you have four classes, you need log2(4) = 2 qubits for the sentence output type, which is s in your case (and in general).
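As a rough illustration of the rule above, the qubit count for the sentence type can be derived from the number of classes. This is a plain-Python sketch; `qubits_for_classes` is a hypothetical helper name and not part of lambeq.

```python
def qubits_for_classes(n_classes: int) -> int:
    """Smallest number of qubits whose 2**n outcomes cover n_classes labels."""
    if n_classes < 1:
        raise ValueError("n_classes must be positive")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float error.
    return max(1, (n_classes - 1).bit_length())


# Four classes (US, SE, O, PE) need two qubits on the sentence wire.
n_sentence_qubits = qubits_for_classes(4)
print(n_sentence_qubits)
```

The result would then be used as the value for AtomicType.SENTENCE in the ob_map.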
The problem with the above config and the cups_reader is that all the cups are also assigned the s type. This is not a problem for short sentences, but for longer ones you would want a different reader that doesn't use only the s type. Have you tried using BobcatParser to create diagrams? You can also use diagram rewrites, especially the RemoveCupsRewriter, in tandem with this parser.
I'm not that familiar with the Aer simulator, but based on the error message I think it already does this check for you, though you are only notified once you try to evaluate your circuits, e.g. during training.
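For a back-of-the-envelope estimate, a dense statevector holds 2^n complex amplitudes. The sketch below assumes 16 bytes per amplitude (double-precision complex), which is an assumption about the simulator's configuration; the helper names are hypothetical. Under that assumption, the 46 qubits shown in the ResultHandle above reproduce the "Required memory: 1073741824M" figure from the error message.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed for a dense statevector (assumes complex128 amplitudes)."""
    return (2 ** n_qubits) * bytes_per_amplitude


def max_qubits_for_memory(max_memory_mib: int, bytes_per_amplitude: int = 16) -> int:
    """Largest qubit count whose statevector fits in max_memory_mib."""
    max_amplitudes = (max_memory_mib * 2 ** 20) // bytes_per_amplitude
    if max_amplitudes < 1:
        return 0
    # floor(log2(max_amplitudes)) via integer arithmetic.
    return max_amplitudes.bit_length() - 1


required_mib = statevector_bytes(46) // 2 ** 20
print(required_mib)                  # 1073741824, as in the error message
print(max_qubits_for_memory(54232))  # 31
```

Circuits wider than that limit could be filtered out before training instead of failing inside trainer.fit().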
To summarize, I would suggest using BobcatParser with a couple of rewrite rules like RemoveCupsRewriter.
Also, your previous label encoding was correct - you have to one-hot encode your target labels.
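A compact way to produce that one-hot encoding is sketched below. The label order (US, SE, O, PE) follows the encoding used in this thread; `LABELS` and `one_hot` are hypothetical names.

```python
LABELS = ["US", "SE", "O", "PE"]  # order taken from the encoding in this thread


def one_hot(label: str) -> list[float]:
    """Return a one-hot vector with a 1.0 at the position of `label`."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return [1.0 if candidate == label else 0.0 for candidate in LABELS]


print(one_hot("O"))
```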
I removed the instances with more than 15 words and went back to one-hot encoded target labels. I tried BobcatParser together with RemoveCupsRewriter. I also reduced the dataset to just 20 instances to make the problem easier, but I still got the same error as in the beginning:
`
CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

13 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('063604d7-6427-4e22-afe9-07503bf9246c', 0, 13, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    353                 backresults = qiskit_result_to_backendresult(res)
    354                 for circ_index, backres in enumerate(backresults):
--> 355                     self._cache[ResultHandle(jobid, circ_index, qubit_n, ppc)][
    356                         "result"
    357                     ] = backres

KeyError: ResultHandle('063604d7-6427-4e22-afe9-07503bf9246c', 1, 13, 'null')
`
Would you suggest changing the simulator at this point? Which one would you use?
Can you provide the versions of lambeq, pytket and pytket-qiskit you're using? And your Python version, too? Thanks! (I should have asked for these earlier.)
Of course, here they are:
Python version: 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]
lambeq version: 0.4.1
pytket version: 1.30.0
pytket-qiskit version: 0.55.0
Thank you. In the meantime, can you perform your experiment with a NumpyModel to see if there are issues with your current experimental setup? I have a different version of pytket and related extensions, which might be why I am not getting the same error you have.
Thank you for your help! I tried with NumpyModel, and training produced the following output:
ValueError                                Traceback (most recent call last)

8 frames
/usr/local/lib/python3.10/dist-packages/lambeq/training/loss.py in _match_shapes(self, y1, y2)
     64                       y2: np.ndarray | jnp.ndarray) -> None:
     65         if y1.shape != y2.shape:
---> 66             raise ValueError('Provided arrays must be of equal shape. Got '
     67                              f'arrays of shape {y1.shape} and {y2.shape}.')
     68

ValueError: Provided arrays must be of equal shape. Got arrays of shape (18, 2, 2) and (18, 4).
The first shape seems to be that of the circuit outputs (18 instances, each with a 2x2 output). I tried setting AtomicType.SENTENCE: 4, and the error changed to:
ValueError: Provided arrays must be of equal shape. Got arrays of shape (18, 2, 2, 2, 2) and (18, 4).
As a last test, I set AtomicType.SENTENCE: 2 again but encoded the targets in the following way:
if target == "US":
    targets.append([[1.0, 0.0], [0.0, 0.0]])
elif target == "SE":
    targets.append([[0.0, 1.0], [0.0, 0.0]])
elif target == "O":
    targets.append([[0.0, 0.0], [1.0, 0.0]])
elif target == "PE":
    targets.append([[0.0, 0.0], [0.0, 1.0]])
And I got the following error:
TypeError Traceback (most recent call last)
3 frames
TypeError: argmax(): argument 'input' (position 1) must be Tensor, not numpy.ndarray
Am I cursed?
Hi @fcasillo
With the switch to the NumpyModel, all the tensor outputs will be of type np.ndarray. You just need to update your metrics to use the equivalent numpy functions instead of the torch ones.
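A minimal sketch of such a numpy-based metric follows. It assumes the model outputs and the targets share a shape such as (N, 2, 2), as in the 2x2 target encoding tried above, and flattens each sample before taking the argmax; `class_indices` is a hypothetical helper. The same index arrays could be passed to the scikit-learn scoring functions used in the original post.

```python
import numpy as np


def class_indices(arr):
    """Flatten each sample, e.g. (N, 2, 2) -> (N, 4), and return the argmax."""
    arr = np.asarray(arr)
    return arr.reshape(len(arr), -1).argmax(axis=1)


def accuracy(y_hat, y):
    """Fraction of samples whose predicted class matches the target class."""
    return float(np.mean(class_indices(y_hat) == class_indices(y)))


# Two samples: targets "O" and "US" in the 2x2 encoding from this thread.
y = [[[0.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]
y_hat = [[[0.1, 0.2], [0.6, 0.1]], [[0.2, 0.5], [0.2, 0.1]]]
print(accuracy(y_hat, y))  # 0.5
```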
It worked! Thank you so much!
I tried again with the Aer simulator, but KeyError: ResultHandle('6925cc08-282a-46d1-9f7c-f95166382196', 1, 13, 'null') persists. Any idea how I can make use of a quantum simulator? I was also planning to try the IonQ platform to run the experiments on real quantum hardware, but I would like to get at least a simulator working first. Maybe I should try directly with the one provided by IonQ?
I'm still investigating the issue with the Aer simulator. Downgrading your pytket and related extensions, e.g. pytket-qiskit, could be a temporary fix. Here are my version numbers:
pytket==1.22.0
pytket-qiskit==0.44.0
qiskit==0.44.1
qiskit-aer==0.12.2
qiskit-ibm-provider==0.7.0
qiskit-ibm-runtime==0.12.2
qiskit-ibmq-provider==0.20.2
qiskit-terra==0.25.1
tket also has this page listing the backends they support, including simulators: https://tket.quantinuum.com/api-docs/extensions.html. I'm not sure if pytket supports IonQ, though.
I don't know if this could be an issue, but I'm testing everything on Colab. That said, I tested with your library versions, and this is the result when training the model:
TypeError                                 Traceback (most recent call last)

8 frames
/usr/local/lib/python3.10/dist-packages/lambeq/backend/quantum.py in eval(self, backend, mixed, contractor, *others, *params)
    309         for i, circuit in enumerate(circuits):
    310             n_bits = len(circuit.post_processing.dom)
--> 311             result = np.zeros((n_bits * (2, )))
    312             for bitstring, count in counts[i].items():
    313                 result[bitstring] = count

TypeError: Cannot interpret '2' as a data type
Following the suggestions in another discussion, I tried rewriting the diagrams in the following manner:
from lambeq import UnifyCodomainRewriter
from lambeq import RemoveCupsRewriter

unify_codomain = UnifyCodomainRewriter()
remove_cups = RemoveCupsRewriter()

bob_diagrams = [unify_codomain(remove_cups(diagram)) for diagram in bobcat_diagrams]
and added discard=True to the IQPAnsatz call.
The last test I did was to one-hot encode the targets as at the start of the discussion, but none of these attempts worked; I always get the same error.
I'm working on fixing that bug (the TypeError bug). The fix should be available in the next release of lambeq.
If you're keen on playing with the simulator backends, you could install this dirty patch I made to address the above issue:
pip install git+https://github.com/neiljdo/lambeq-public.git@hotfix#egg=lambeq
YMMV but rest assured, this issue should be fixed in a future release.
Good morning @neiljdo! This version works, even though I had to update the metrics to use numpy. Thank you for your big help! One last question, unrelated to this issue: I noticed the simulator is somewhat slow when dealing with more instances. Is there a way to run it on a GPU?
Hi @fcasillo
You can use NumpyModel with JIT (following the instructions here). There's a note there, though, that it's not advisable to use JIT for large models, e.g. high qubit count, due to larger memory requirements.
For the other backends, you have to check their specific docs to see if they enable GPU acceleration. For example, PennyLane has the lightning.gpu device.
Thanks again for your assistance and suggestions! Best of luck, see you next time!
Hello everybody, I'm exploring the lambeq library for a multiclass classification task. Following the guidelines in the documentation, I've implemented the following code:
`print(df)
0     0  system shall interface faculty central server       O
1     1  product expected integrate multiple database m...   O
2     2  dispute application shall interface merchant i...   O
3     3  system shall able operate within business offi...   O
4     4  website must fully operational msn tv2              O
..  ...  ...                                                 ...
75  216  system shall used realtor training                  US
76  217  product shall installed untrained realtor with...   US
77  218  user need read user manual able use application     US
78  219  product shall easy customer novice skill inter...   US
79  220  product shall easy use adjuster collision esti...   US

[80 rows x 3 columns]`
def create_multi_diagrams(df, reader):
    diagrams, targets = [], []
    for _, row in enumerate(df.to_numpy()):
        sentence, target = str(row[1]), str(row[2])
        try:
            diagrams.append(reader.sentence2diagram(sentence))
        except Exception as e:
            print(sentence)
            print(e)
            continue
        if target == "US":
            targets.append([1.0, 0.0, 0.0, 0.0])
        elif target == "SE":
            targets.append([0.0, 1.0, 0.0, 0.0])
        elif target == "O":
            targets.append([0.0, 0.0, 1.0, 0.0])
        elif target == "PE":
            targets.append([0.0, 0.0, 0.0, 1.0])
    return diagrams, targets

cups_diagrams, cups_targets = create_multi_diagrams(df, cups_reader)
`
def create_circuits(diagrams, targets, ansatz):
    circuits = []
    new_targets = []
    i = 0
    for diagram in diagrams:
        try:
            circuits.append(ansatz(diagram))
        except Exception as e:
            print(e)
            i += 1
            continue
        new_targets.append(targets[i])
        i += 1
    return circuits, new_targets

ob_map = {AtomicType.SENTENCE: 1, AtomicType.NOUN: 1}
iqpansatz = IQPAnsatz(ob_map, n_layers=1, n_single_qubit_params=2)
`
cups_iqp_circuits, cups_iqp_targets = create_circuits(cups_diagrams, cups_targets, iqpansatz)
`
def accuracy(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return accuracy_score(y_true, y_pred)

def precision(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return precision_score(y_true, y_pred, average='weighted', zero_division=1)

def recall(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return recall_score(y_true, y_pred, average='weighted', zero_division=1)

def f1score(y_hat, y):
    y_true = torch.argmax(y, dim=1)
    y_pred = torch.argmax(y_hat, dim=1)
    return f1_score(y_true, y_pred, average='weighted', zero_division=1)

eval_metrics = {"prec": precision, "rec": recall, "acc": accuracy, "f1": f1score}
`
`
from pytket.extensions.qiskit import AerBackend
from lambeq import BinaryCrossEntropyLoss

backend = AerBackend()
backend_config = {
    'backend': backend,
    'compilation': backend.default_compilation_pass(2),
    'shots': 8192
}

model = TketModel.from_diagrams(cups_iqp_circuits, backend_config=backend_config)

trainer = QuantumTrainer(
    model,
    loss_function=BinaryCrossEntropyLoss(),
    epochs=20,
    optimizer=NelderMeadOptimizer,
    optim_hyperparams={'a': 0.05, 'c': 0.06, 'A': 0.001*20},
    evaluate_functions=eval_metrics,
    evaluate_on_train=True,
    verbose='text',
    seed=SEED
)

X_train_grid, X_val_grid, y_train_grid, y_val_grid = train_test_split(
    cups_iqp_circuits, cups_iqp_targets, test_size=1/10, random_state=SEED)
train_dataset = Dataset(X_train_grid, y_train_grid, batch_size=64)
val_dataset = Dataset(X_val_grid, y_val_grid, shuffle=False)

trainer.fit(train_dataset, val_dataset, log_interval=1)
`
Everything works fine until the call to trainer.fit(train_dataset, val_dataset, log_interval = 1), where I get the following error:
`
---------------------------------------------------------------------------
CircuitNotRunError                        Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    343         try:
--> 344             return super().get_result(handle)
    345         except CircuitNotRunError:

14 frames
CircuitNotRunError: Circuit corresponding to ResultHandle('3ab8e99f-729a-47aa-b523-f7f16d58e163', 0, 8, 'null') has not been run by this backend instance.

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pytket/extensions/qiskit/backends/aer.py in get_result(self, handle, **kwargs)
    353                 backresults = qiskit_result_to_backendresult(res)
    354                 for circ_index, backres in enumerate(backresults):
--> 355                     self._cache[ResultHandle(jobid, circ_index, qubit_n, ppc)][
    356                         "result"
    357                     ] = backres

KeyError: ResultHandle('3ab8e99f-729a-47aa-b523-f7f16d58e163', 1, 8, 'null')
`
Any suggestion on how to modify the code to make it work?