CQCL / lambeq

A high-level Python library for Quantum Natural Language Processing
https://docs.quantinuum.com/lambeq/
Apache License 2.0

NumpyModel not converging #20

Closed · ACE07-Sev closed this issue 2 years ago

ACE07-Sev commented 2 years ago

Hi, I have an issue where the NumpyModel doesn't converge to a good accuracy or a low loss. I thought this might be due to my dataset, so I tried the previously tested food/IT dataset, but hit the same issue. I then thought it might be my code, so I reran the sample notebook for the food/IT dataset, and still got the same result. Any ideas on how to resolve this? Regardless of the dataset and the model, training always starts with a train and validation loss of approximately 0.76-0.8 and an accuracy of around 40 percent, and just wobbles back and forth there. Could you kindly assist me in resolving this? I have also posted the full code in my repo: https://github.com/ACE07-Sev/QNLP

y-richie-y commented 2 years ago

Thanks for the issue! This has been addressed on Discord, and the model now trains properly. A number of things specific to this task needed to be fixed, so I won't reiterate them here.

ACE07-Sev commented 2 years ago

Thanks to richie we were able to resolve this. As mentioned, the issue thread is on Discord; for those who simply want a quick summary, we:

1) Used random.shuffle to shuffle the datasets.
2) Used spiders_reader, as Bobcat could not perform optimally on this task (Bobcat reached 60% accuracy on training, 50% on validation and 47% on test, whereas spiders_reader reached 100% on training, 87% on validation and 77% on test after 300 epochs with a learning rate of 0.2).
3) Used NumpyModel instead of the TketModel, for its speed and its higher accuracy when simulating the QPU.
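For step 1, the important detail is to shuffle sentences and labels together so the pairs stay aligned. A minimal stdlib-only sketch (the sentences and labels here are placeholders, not the actual dataset):

```python
import random

# Placeholder sentence/label pairs standing in for the real dataset.
sentences = ['s1', 's2', 's3', 's4']
labels = [[1, 0], [1, 0], [0, 1], [0, 1]]

# Zip before shuffling so each sentence keeps its label.
pairs = list(zip(sentences, labels))
random.seed(0)       # fixed seed for reproducibility
random.shuffle(pairs)
sentences, labels = map(list, zip(*pairs))
```

Shuffling the two lists independently (two separate `random.shuffle` calls) would scramble the sentence-label correspondence and make training meaningless.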

The choice of hyperparameters is made on a case-by-case basis, depending on the size of the dataset and the homogeneity of the instances as a whole, but in our case we used a learning rate of 0.2, n_layers = 1, and a total of three atomic types (NOUN, SENTENCE and PREPOSITIONAL_PHRASE), each mapped to 1 qubit. You can find the corrected script in my repo for a better understanding: https://github.com/ACE07-Sev/QNLP

Again, this issue was resolved thanks to richie's brilliance in isolating the problem to the parser. Thank you so much.