CQCL / lambeq

A high-level Python library for Quantum Natural Language Processing
https://cqcl.github.io/lambeq-docs
Apache License 2.0

Problem with trainer.fit(), operands of different shape #14

Closed Stephenito closed 2 years ago

Stephenito commented 2 years ago

Hi, I am trying to run the quantum trainer algorithm. When running the following line:

trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

I get the following error:

ValueError                          Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

File c:\python38\lib\site-packages\lambeq\training\trainer.py:365, in Trainer.fit(self, train_dataset, val_dataset, evaluation_step, logging_step)
    363 step += 1
    364 x, y_label = batch
--> 365 y_hat, loss = self.training_step(batch)
    366 if (self.evaluate_on_train and
    367         self.evaluate_functions is not None):
    368     for metr, func in self.evaluate_functions.items():

File c:\python38\lib\site-packages\lambeq\training\quantum_trainer.py:149, in QuantumTrainer.training_step(self, batch)
    133 def training_step(
    134         self,
    135         batch: tuple[list[Any], np.ndarray]) -> tuple[np.ndarray, float]:
    136     """Perform a training step.
    137 
    138     Parameters
   (...)
    147 
    148     """
--> 149     y_hat, loss = self.optimizer.backward(batch)
    150     self.train_costs.append(loss)
    151     self.optimizer.step()

File c:\python38\lib\site-packages\lambeq\training\spsa_optimizer.py:126, in SPSAOptimizer.backward(self, batch)
    124 self.model.weights = xplus
    125 y0 = self.model(diagrams)
--> 126 loss0 = self.loss_fn(y0, targets)
    127 xminus = self.project(x - self.ck * delta)
    128 self.model.weights = xminus

Input In [13], in <lambda>(y_hat, y)
----> 1 loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)  # binary cross-entropy loss
      3 acc = lambda y_hat, y: np.sum(np.round(y_hat) == y) / len(y) / 2  # half due to double-counting
      4 eval_metrics = {"acc": acc}

ValueError: operands could not be broadcast together with shapes (30,2) (30,)

I have just fixed the .py file in the library following #12. The algorithm raised an error even before that fix; I can't recall exactly, but I don't think it was the same error.

What can I do to solve this? Thank you for your time.
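For reference, the shape mismatch in the traceback can be reproduced with plain NumPy. This is a minimal sketch: the shapes `(30, 2)` and `(30,)` are taken from the error message, and the product below is the elementwise multiplication inside the cross-entropy lambda.

```python
import numpy as np

y = np.random.rand(30, 2)        # 2-D labels, as in the error message
y_hat = np.random.rand(30)       # 1-D model output

try:
    # the elementwise product inside the binary cross-entropy loss
    _ = y * np.log(y_hat)
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (30,2) (30,)
```

NumPy broadcasting aligns shapes from the trailing axis: `(30, 2)` against `(30,)` compares 2 with 30, which is incompatible, hence the `ValueError`.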

Thommy257 commented 2 years ago

Hi, it seems as if your model output shape doesn't match the shape of the labels. How do you generate your diagrams? Also, the loss function from the tutorial notebooks is designed for 2-d outputs. If your model yields a scalar value, you need to modify it.
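For a model that yields a single probability per sentence, the tutorial's 2-D loss can be adapted to the scalar form of binary cross-entropy. This is a minimal sketch, not lambeq code; the `eps` clipping constant is an assumption added to avoid `log(0)`.

```python
import numpy as np

def scalar_bce(y_hat, y, eps=1e-9):
    """Binary cross-entropy for 1-D outputs and 1-D labels in [0, 1]."""
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)
    y = np.asarray(y, dtype=float)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def scalar_acc(y_hat, y):
    """Accuracy for 1-D outputs, thresholded at 0.5 (no double-counting factor)."""
    return np.mean(np.round(y_hat) == np.asarray(y, dtype=float))
```

Note the accuracy no longer needs the `/ 2` correction from the tutorial, since each sentence now contributes a single prediction rather than a two-component one.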

Stephenito commented 2 years ago

Hi, the labels are 2-D arrays, like in the documentation's example. I tried changing the labels to a 1-D array (by changing the read_data function), but then the program behaves strangely: it gets stuck after the first epoch, with a loss of 0 on both the validation and training datasets. To rule out a mistake in my own code, I also tried your full trainer_quantum code example, but I got the same behaviour. The data is in this format:

1 woman teaches simple categories
1 woman describes simple maths

I think it got parsed correctly, as the dataset arrays match your runs.

I will try to look at it in the next few days. Thanks for your help!

dimkart commented 2 years ago

@Stephenito Hi -- As @Thommy257 said, the problem is that while your labels are 2-D (as you confirm), the output of the model is 1-D (a scalar per sentence). After getting the output of the model, you need to convert it into 2-D before passing it to the loss function. Hope this helps.
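One way to do the conversion described above is to stack each scalar output p with its complement 1 - p, so the result matches 2-D one-hot-style labels. This is a hedged sketch: `to_two_d` is a hypothetical helper written for illustration, not a lambeq function.

```python
import numpy as np

def to_two_d(y_hat_1d):
    """Turn a (N,) array of probabilities p into an (N, 2) array [p, 1 - p]."""
    p = np.asarray(y_hat_1d, dtype=float)
    return np.stack([p, 1.0 - p], axis=1)
```

With this applied, a `(30,)` model output becomes `(30, 2)` and broadcasts cleanly against `(30, 2)` labels in the tutorial's loss.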

Stephenito commented 2 years ago

Hi, instead of modifying the model I made the labels 1-D. As I said before, the error is gone, but training behaves strangely. Each epoch takes 40 seconds, and the output looks like this:

Epoch 1: train/loss: 0.0000 valid/loss: 0.0000 train/acc: 0.2458 valid/acc: 0.3000
Epoch 2: train/loss: 0.0000 valid/loss: 0.0000 train/acc: 0.2458 valid/acc: 0.3000

Training completed!

I tried with the following samples:

Thanks again!

dimkart commented 2 years ago

Have you also adjusted your loss function? Or does it still assume your labels are 2-D?

Stephenito commented 2 years ago

Yes, I adjusted it for scalar values, but I get the same behaviour. I then tried modifying the ansatz to make the output 2-D (and to work with 2-D labels, as at the beginning), and now it gives normal values, even though I haven't really understood what an ansatz is or how to design one. One last question: why is it so slow? What should I modify to make it faster?

ACE07-Sev commented 2 years ago

Hi, I am getting the same error.

ACE07-Sev commented 2 years ago

This error is caused by one or more diagrams having two output wires. One way to resolve it is to manually check all the diagrams and see which sentences have two output wires instead of a single S wire. In my experience it is usually the sentences that start with a verb, such as: "Do not come here", "Learn how to drive", "Kill the traitors", "Love your neighbours", etc. If you have too many instances to check, just make sure they all start with a noun, like "I", "you", "he", "she", "they", "man", "woman", "it", "person", names, etc.
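The manual check suggested here can be sketched as a small heuristic filter. This is an illustration only: the `NOUN_STARTERS` list and the `starts_with_noun` helper are assumptions, not part of lambeq, and inspecting each diagram's output type directly is the reliable check.

```python
# Heuristic word list for sentences likely to parse to a single S wire.
NOUN_STARTERS = {"i", "you", "he", "she", "it", "they",
                 "man", "woman", "person"}

def starts_with_noun(sentence):
    """Flag sentences whose first word is in a known noun/pronoun list."""
    first = sentence.split()[0].lower()
    return first in NOUN_STARTERS

sentences = ["woman teaches simple categories", "Kill the traitors"]
suspects = [s for s in sentences if not starts_with_noun(s)]
# "Kill the traitors" starts with a verb, so it is flagged for inspection
```

Sentences caught by this filter are the candidates to inspect for a non-S output type before feeding them to the trainer.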

y-richie-y commented 2 years ago

@ACE07-Sev your issue is different: it arises from Bobcat correctly parsing imperative sentences to the pregroup type n.r @ s. For example:

     Tell      me      what     you    think                                                                                         
─────────────  ──  ───────────  ───  ─────────
n.r·s·n.l·n.l  n   n·n.l.l·s.l   n   n.r·s·n.l
 │  │  │   ╰───╯   │   │    │    ╰────╯  │  │
 │  │  ╰───────────╯   │    ╰────────────╯  │
 │  │                  ╰────────────────────╯

y-richie-y commented 2 years ago

@Stephenito since the original issue has been resolved, I will close the issue.

The TketModel is typically used with IBM's Aer simulator, which is much slower than NumpyModel. If you still have problems with performance, please open a new issue.