Closed dimkart closed 1 year ago
@nikhilkhatri wrote:
Hi @yousrabou, It would help to know the following:
I'd recommend taking a look at the documentation for the loss function, which describes the expected param shapes.
This will be closed due to inactivity.
I use 1 qubit for the noun type and 0 qubits for the sentence type.
from lambeq import AtomicType, IQPAnsatz, remove_cups
ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 0},
n_layers=1, n_single_qubit_params=3)
train_circuits = [ansatz(remove_cups(diagram)) for diagram in raw_train_diagrams]
val_circuits = [ansatz(remove_cups(diagram)) for diagram in raw_val_diagrams]
train_circuits[0].draw(figsize=(9, 10))
2. I prepare my true targets as follows:
from lambeq import Dataset
train_dataset = Dataset(train_circuits, train_labels, batch_size=BATCH_SIZE)
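As background, here is a toy sketch in plain Python (not lambeq API; the names `qubit_map` and `state_dim` are made up for illustration): the ansatz maps each atomic type to a number of qubits, and a diagram's output state has dimension 2 to the power of the total qubit count across its codomain, so diagrams whose codomains map to different qubit counts produce states of different shapes.

```python
# Toy illustration (not lambeq API): qubit counts per atomic type,
# mirroring the IQPAnsatz arguments above (NOUN: 1, SENTENCE: 0).
qubit_map = {"n": 1, "n.r": 1, "s": 0}

def state_dim(codomain):
    """Dimension of the output state for a list of codomain atomic types."""
    return 2 ** sum(qubit_map[t] for t in codomain)

print(state_dim(["s"]))          # 1: a single scalar amplitude
print(state_dim(["n"]))          # 2: one qubit
print(state_dim(["n", "n.r"]))   # 4: two open wires give a different shape
```

With `AtomicType.SENTENCE` mapped to 0 qubits, a plain `s` codomain gives a scalar while any leftover `n` wire gives a 2-dimensional state, which is one way such shape mismatches can arise.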
When using 1 qubit for both noun and sentence types, another error is produced :'(
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
> [<ipython-input-28-04c3311cae81>](https://localhost:8080/#) in <cell line: 1>()
> ----> 1 trainer.fit(train_dataset, logging_step=12)
>
> 6 frames
> [/usr/local/lib/python3.10/dist-packages/lambeq/training/tket_model.py](https://localhost:8080/#) in get_diagram_output(self, diagrams)
> 111 result = self._normalise_vector(tensors.array)
> 112 return result.reshape(1, *result.shape)
> --> 113 return np.array([self._normalise_vector(t.array) for t in tensors])
> 114
> 115 def forward(self, x: list[Diagram]) -> np.ndarray:
>
> ValueError: could not broadcast input array from shape (2,2) into shape (2,)
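For what it's worth, the ValueError above is NumPy refusing to stack per-diagram states of different shapes into a single batch array. A minimal reproduction with hand-made vectors (shapes chosen to match the traceback):

```python
import numpy as np

# A diagram with one open wire evaluates to a state of shape (2,),
# a diagram with two open wires to a state of shape (2, 2).
one_wire = np.ones(2) / np.sqrt(2)
two_wires = np.ones((2, 2)) / 2.0

shapes = {one_wire.shape, two_wires.shape}
print(shapes)  # two distinct shapes: these cannot be stacked into one array
```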
Hi @yousrabou,
Could you check what the codomains of the diagrams in raw_train_diagrams are?
You need to choose ansatz hyperparameters such that all these codomains are mapped to the same number of qubits.
If this is not the case, your target measurement is not going to be well-defined, since each sentence will produce a quantum state of a different shape.
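A quick way to audit this is to tally the codomains before building circuits. The snippet below is a sketch only: the codomains are written as plain strings here, whereas in lambeq you would read them off each diagram (e.g. via its `cod` attribute).

```python
from collections import Counter

# Hypothetical codomains, one per diagram (stand-ins for each diagram's cod).
codomains = ["s", "s", "n.r @ s", "s", "n"]

print(Counter(codomains))  # Counter({'s': 3, 'n.r @ s': 1, 'n': 1})

# Indices of diagrams whose codomain differs from the expected 's':
bad = [i for i, c in enumerate(codomains) if c != "s"]
print(bad)  # these diagrams need fixing or removing
```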
The codomains are: n, s, and n.r (open wires). I observe that there are two codomains (n and n.r) in the same diagram (I don't know if this is helpful).
What should I do in this case to have the same number of qubits?
NOTE: I'm using the RelPron dataset used previously in the QNLP literature.
Could this be the error where you get two wires for s instead of one? I got this before; the issue was with the data. I remember it occurred for some specific sentences.
If I recall correctly it was for imperative sentences, i.e. "Go brush your teeth", because there were two verbs. I think @y-richie-y remembers it; I asked him, and once he explained it to me, I removed those sentences and it was fixed.
After double-checking my dataset, I discovered that many sentences (not just one) have two open wires. An example is shown below:
The parser tagged 'player' as the verb (s) and expected a noun on the right side! The same occurs for other sentences.
The problem is fixed after eliminating all of these sentences.
Should I open a new issue concerning the parser because it incorrectly parses the sentences?
Yep. Removing them fixes the problem, but this isn't what I meant. Imperative sentences are tricky because POS tagging sees "Go do sth" as two verbs, not one. However, for your case, it's strange that "player" is seen as a verb.
Although I can see where it's coming from: you have a badly written sentence, as far as I can tell.
"Player that pitcher strike" is not a grammatically correct sentence. Try changing it to "Player that pitcher strikes". I think the issue is not with the parser, it's with your sentences. @dimkart Am I correct or missing something?
Thanks @ACE07-Sev
The sentence 'player that pitcher strike' is parsed correctly when I parse it on its own with sentence2diagram or sentences2diagrams.
However, when applied to the entire dataset (using sentences2diagrams), the parser improperly parses numerous phrases; this is a highly unusual problem. I wish there were an option other than removing these sentences.
I see. Well, if it works with sentence2diagram, it should also work with sentences2diagrams. I'll check the src, but I think sentences2diagrams just calls sentence2diagram iteratively, nothing more and nothing less.
Can you show me the code that produces the error, and send me your dataset? It's Amirali by the way, Yousra; feel free to send it over Discord and I'll have a look. In the meantime, if using sentence2diagram fixes your issue (meaning you don't have to delete the sentences), then just run it iteratively over your entire dataset, one sentence at a time.
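That loop can be sketched as follows. Here `parse` is a stub standing in for the parser call, with hypothetical failure behaviour chosen purely for illustration; the point is that failing sentences are collected instead of aborting the whole batch:

```python
# Stub parser (hypothetical): fails on one known-problematic sentence.
def parse(sentence):
    if sentence == "player that pitcher strike":
        raise ValueError("unexpected parse")
    return f"diagram<{sentence}>"

sentences = ["player that pitcher strikes", "player that pitcher strike"]
diagrams, failed = [], []
for s in sentences:
    try:
        diagrams.append(parse(s))
    except ValueError:
        failed.append(s)  # collect for later inspection instead of crashing

print(failed)  # the sentences to inspect by hand
```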
Thanks @ACE07-Sev for your helpful advice.
Hi, parsers are statistical tools and sometimes inevitably make mistakes. Having said that, note: the expected type for "player that pitcher strikes" is a single n, and indeed that is what I get on my system:

(main) dimkart@Dimitris-MBP ~/w/l/docs ((75099d05…))> lambeq "player that pitcher strikes"
player that pitcher strikes
────── ─────────────── ─────── ─────────
n n.r·n·n.l.l·s.l n n.r·s·n.l
╰──────╯ │ │ │ ╰──────╯ │ │
│ │ ╰────────────────╯ │
│ ╰────────────────────────╯
n

An imperative sentence, on the other hand, gets the type n.r·s:

(main) dimkart@Dimitris-MBP ~/w/l/docs ((75099d05…))> lambeq "Do your homework"
Do your homework
───────── ───── ────────
n.r·s·n.l n·n.l n
│ │ ╰───╯ ╰──────╯
n.r s

This (n.r·s) is the right type for this kind of sentence, since the subject of the sentence is missing.
One solution is to write a rewrite rule (or to modify the diagrams directly, if you prefer) so that, when a sentence has two output wires, it adds a box that combines them into one s wire. This should be done at the string-diagram level, i.e. before converting to circuits. Another is to use a compositional scheme that is not based on syntax, e.g. a StairsReader. See this discussion for more information.
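At the tensor level, such a merge box is just a linear map from the two-wire space to the one-wire space. A toy NumPy sketch follows; the particular 2x4 matrix is an arbitrary illustration, not a recommended choice of box:

```python
import numpy as np

# State of a diagram with two open wires (e.g. n.r @ s), flattened to 4 dims.
two_wire_state = np.ones(4) / 2.0

# A "merge" box: some 2x4 linear map sending two wires to one
# (this particular matrix is an arbitrary example).
merge = np.array([[1.0, 0.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0, 0.0]]) / np.sqrt(2)

one_wire_state = merge @ two_wire_state
one_wire_state = one_wire_state / np.linalg.norm(one_wire_state)
print(one_wire_state.shape)  # (2,): now matches the single-wire diagrams
```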
Thank you @dimkart.
This will be closed as resolved.
Originally posted by @yousrabou in https://github.com/CQCL/lambeq/issues/83#issuecomment-1540459499