CQCL / lambeq

A high-level Python library for Quantum Natural Language Processing
https://cqcl.github.io/lambeq-docs
Apache License 2.0

inference #8

Closed nlpirate closed 2 years ago

nlpirate commented 2 years ago

I'm trying to run the quantum pipeline using the JAX backend. To better display the results (how each sentence was classified into the two categories), is there a code example showing how to perform inference, as in classic NLP deep learning approaches (e.g. transformer-based models or similar)?

dimkart commented 2 years ago

Hi @nlpirate, not sure I understand what the request is. Do you need an example "that realises the inference as in the classic NLP deep learning approaches"? Because all of the training examples in the notebooks and the documentation are essentially based on standard supervised learning.

nlpirate commented 2 years ago

Put very simply: given an input sentence, I would like to obtain as output the category it is classified into.

dimkart commented 2 years ago

You basically need to assign to the sentence space (S) a number of qubits that can encode the number of classes you have, e.g. for a binary classification task one qubit is enough. Very roughly:

  1. Create a diagram for your sentence
  2. Apply an ansatz to convert it into a circuit, using the appropriate number of qubits for your sentence space
  3. Measure the circuit if you run the experiment on a quantum simulator or an actual quantum machine (as in this example), or contract the circuit if you are just using tensor networks (as here); either way, this gives you back a prediction for your class.
  4. Compute loss (e.g. BCE), update your parameters, and continue training
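Step 4 above can be sketched in isolation. This is a minimal, hypothetical illustration with plain NumPy, not lambeq's API: the `probs` array stands in for the per-sentence class-1 probabilities you would get from measuring or contracting the circuits, and the labels are made up.

```python
import numpy as np

def bce_loss(probs, labels, eps=1e-9):
    """Binary cross-entropy over a batch of per-sentence probabilities."""
    probs = np.clip(probs, eps, 1 - eps)  # avoid log(0)
    return -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

# Hypothetical circuit outputs: one class-1 probability per sentence.
probs = np.array([0.9, 0.2, 0.6, 0.1])
labels = np.array([1, 0, 1, 0])

loss = bce_loss(probs, labels)
preds = np.round(probs)              # hard labels: 1.0 if prob >= 0.5
accuracy = np.mean(preds == labels)  # here all four are classified correctly
```

In the actual pipeline, this loss would drive the parameter updates of the ansatz (e.g. via SPSA, as in the linked example).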

Essentially, all the examples in lambeq's documentation and in our papers implement some form of classification task. For example, the "QNLP in practice" paper implements two classification models: https://arxiv.org/pdf/2102.12846.pdf.

nlpirate commented 2 years ago

Thanks for the quick reply and the clarification about the pipeline! I am proceeding with the experiments, but I still cannot get the output values of the system on the test set. Broadly speaking, I would like to see as output, for each test case, the label assigned by the classifier.

In a "classical" deep learning approach, once the model is created/trained, one can save it and reload it when necessary. Is there a way to do something similar with your approach? I cannot see how to retrieve the classification for each sentence without a model that I can reload.
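On the saving question: in the SPSA-based example discussed in this thread, the trained "model" is essentially just the optimised parameter vector (`result['x']` in the notebook) plus the fixed circuit structure. So one simple way to persist it, sketched here with plain NumPy and a made-up parameter array, is to save that array and reload it in a later session:

```python
import numpy as np

# Hypothetical trained parameters, standing in for result['x'] from the
# SPSA optimisation in the example notebook.
trained_params = np.array([0.12, -1.57, 0.88, 2.34])

# Persist to disk after training ...
np.save("model_params.npy", trained_params)

# ... and reload later. Note that only the parameter values are stored:
# the circuit/ansatz structure must be rebuilt identically in the new
# session before the parameters can be plugged back in.
reloaded = np.load("model_params.npy")
```

With the parameters reloaded, per-sentence predictions would be obtained by passing them back to the relevant prediction function (e.g. `np.round(test_pred_fn(reloaded))` in the notebook's terms).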

RobinWLorenz commented 2 years ago

Hi @nlpirate, I am not sure I understood the question correctly, but the labels that the model predicts can indeed be obtained in a simple way. It is perhaps most easily explained using the concrete example notebook that @dimkart already mentioned.

After the line result = minimizeSPSA(...) in the second-to-last cell, the dictionary result contains, in particular, result['x'], the parameters of the final model after training. While the cost functions in that notebook (e.g. train_cost_fn) only return quantities computed over an entire dataset (such as cost and accuracy), there are also prediction functions, called by the cost functions, that return predictions for individual sentences.

In short, the predicted labels, e.g. for the training data, can be obtained with np.round(train_pred_fn(result['x'])), where train_pred_fn is the function defined earlier in the same notebook. Analogously, dev_pred_fn and test_pred_fn give you the labels for the sentences in the dev and test sets, respectively. Please let us know if you still have any questions.
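The rounding step described above can be sketched as follows. The prediction function here is a hypothetical stand-in (the real `train_pred_fn` depends on the notebook's circuits and data); the point is only that it returns one probability per sentence, which `np.round` converts into hard 0/1 labels.

```python
import numpy as np

def train_pred_fn(params):
    """Stand-in for the notebook's prediction function: returns one
    class-1 probability per training sentence (values are made up)."""
    return np.array([0.93, 0.08, 0.41, 0.77])

final_params = np.zeros(4)  # placeholder for result['x']

probs = train_pred_fn(final_params)
labels = np.round(probs)    # per-sentence hard labels: [1., 0., 0., 1.]
```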

dimkart commented 2 years ago

Closed due to inactivity.