Closed JonasLandman closed 3 years ago
Hi @JonasLandman , thank you for your submission! Could you please update the issue title with your project name?
Hi @JonasLandman, can you please confirm your team name is "TeamX" on the QML Challenges portal? There is also a "Team X", just want to make sure we connect to the right team
Hello, no, the correct team is Team X.
Yes, sorry @co9olguy, our name is "Team X" (with @Slimane33). My bad, there's a duplicate on the QHack board.
Is "TeamX" also you? i.e., do you control both logins?
Yes "TeamX" is also us.
Thanks for your Power Up Submission @JonasLandman !
To help us keep track of final submissions, we will be closing all of the [Power Up] issues. We ask you to open a new issue for your final submission. Please use this pre-formatted [Entry] Issue template. Note that for the final submission, the Resource Estimate requirement is replaced by a Presentation item.
Team Name:
TeamX
Project Description:
In this project, we developed a variational quantum algorithm for Natural Language Processing. Our goal is to train a quantum circuit so that it can process and recognize words. Applications range from word matching and sentence completion to sentence generation and more. We use state-of-the-art deep learning word embeddings and amplitude-encoded quantum registers, with a new ansatz and training methodology based on the swap test between words.
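The project's full ansatz lives in the linked repository, but the swap-test primitive the training is based on can be sketched with a small NumPy state-vector simulation (the function name and the restriction to single-qubit word registers are our illustration, not the project's code). The ancilla's probability of reading 0 is (1 + |⟨psi|phi⟩|²)/2, which is how the circuit compares two encoded words:

```python
import numpy as np

def swap_test_prob0(psi, phi):
    """Simulate the swap test between two single-qubit states psi and phi.
    Returns P(ancilla = 0) = (1 + |<psi|phi>|^2) / 2."""
    # Qubit order: ancilla (q0), psi (q1), phi (q2); basis index = a*4 + p*2 + q.
    state = np.kron(np.array([1.0, 0.0]), np.kron(psi, phi)).astype(complex)
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    H_anc = np.kron(H, np.eye(4))          # Hadamard on the ancilla only
    # Controlled-SWAP: exchange q1 and q2 when the ancilla is |1>,
    # i.e. swap basis states |101> (index 5) and |110> (index 6).
    cswap = np.eye(8, dtype=complex)
    cswap[5, 5] = cswap[6, 6] = 0
    cswap[5, 6] = cswap[6, 5] = 1
    state = H_anc @ (cswap @ (H_anc @ state))
    # Probability of measuring the ancilla in |0>: first half of the register.
    return float(np.sum(np.abs(state[:4]) ** 2))
```

Identical words give P(0) = 1, orthogonal words give P(0) = 1/2, so the measured probability serves as a trainable similarity score.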
Source code:
https://github.com/Slimane33/qhack_project
Resource Estimate:
We can use AWS SV1 to parallelize the gradient computation during training, but the computational cost remains high due to the number of sentences and the total number of words in the dictionary.
With the resources currently available, we estimate the training time to be:
For 10k sentences with 10 words per sentence / 2 qubits per word / 2 layers -> 4 days
For 10k sentences with 7 words per sentence / 3 qubits per word / 2 layers -> 10 days
We have started generating a synthetic dataset to limit resource consumption. In any case, we may need more resources from AWS.
Number of qubits required: The quantum circuit to be trained encodes one sentence plus an extra word and an ancillary qubit, and therefore uses Q*(N+1)+1 qubits, where N is the number of words per sentence and Q is the number of qubits per word. E.g., for a 4-word sentence with 3 qubits per word, we require 16 qubits; for a 5-word sentence with 4 qubits per word, we require 25 qubits.
Number of trainable parameters: The number of trainable parameters in the ansatz is approximately Q*(1+N/2)*L on average, where L is the number of layers (the exact count depends on the parity of the number of words and of the number of qubits). E.g., for a 4-word sentence with 3 qubits per word and 3 layers, we require 27 parameters.
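The two counting formulas above can be written as small helpers (the function names are our own; the formulas are exactly those stated, so the parameter count is the stated average, not the exact parity-dependent value):

```python
def num_qubits(N, Q):
    """Qubits needed for an N-word sentence plus one extra word
    and one ancillary qubit, with Q qubits per word: Q*(N+1) + 1."""
    return Q * (N + 1) + 1

def num_params(N, Q, L):
    """Average number of trainable ansatz parameters: Q*(1 + N/2)*L."""
    return Q * (1 + N / 2) * L
```

These reproduce the worked examples: num_qubits(4, 3) gives 16, num_qubits(5, 4) gives 25, and num_params(4, 3, 3) gives 27.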