huckiyang / QuantumSpeech-QCNN

IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition

!python main_qsr.py --mel 1 --quanv 1 : q_valid shape is 0 #5

Closed rkuo2000 closed 3 years ago

rkuo2000 commented 3 years ago

I am running --mel 1 --quanv 1 on the Google Speech Commands v0.0.1 dataset, and q_valid.shape is 0.

In helper_q_tool.gen_qspeech, q_train comes out with the correct q_train.shape, but q_valid is never converted from x_valid and ends up with shape 0?

huckiyang commented 3 years ago

Hi @rkuo2000, I don't understand what your question is this time. For the decoding setting, I have checked and run this released code in a clean Linux environment as shown in the readme. I am not sure whether it is due to the virtual environment you are using (e.g., Colab) or something else. Cheers.

rkuo2000 commented 3 years ago

I guess q_valid also needs a few lines to append the quanv outputs of x_valid, like x_train has; otherwise it stays empty:

# Mirror the existing q_train loop: quanv-encode every validation sample.
q_valid = []
print("\nQuantum pre-processing of test Speech:")
for idx, img in enumerate(x_valid):
    print("{}/{}        ".format(idx + 1, len(x_valid)), end="\r")
    q_valid.append(quanv(img, kr))
q_valid = np.asarray(q_valid)
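For context, a minimal sketch of how both splits could be encoded through one shared helper (quanv and kr are the kernel function and size already used for q_train; the helper name encode_split is made up for illustration):

import numpy as np

def encode_split(xs, kernel, name):
    # Apply the quanv kernel to every spectrogram in one data split.
    out = []
    print("\nQuantum pre-processing of {} Speech:".format(name))
    for idx, img in enumerate(xs):
        print("{}/{}        ".format(idx + 1, len(xs)), end="\r")
        out.append(quanv(img, kernel))
    return np.asarray(out)

q_train = encode_split(x_train, kr, "training")
q_valid = encode_split(x_valid, kr, "test")
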
huckiyang commented 3 years ago

Hi @rkuo2000, thank you for letting me know about this issue again. Yes, you are correct. Some lines went missing during the code merge. For q_valid, simply following the q_train process works. I have updated the file. Sorry for the inconvenience, and good luck with your future work.

Note that encoding all speech wave files in the Google Speech Commands dataset v1 with the (2, 2) quanv kernel takes around 2 to 3 weeks on an Intel i9 CPU.
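One way to avoid paying that cost repeatedly is to cache the encoded arrays to disk so the quanv pass only has to run once. A generic sketch, independent of the repo's own saving logic (quanv, kr, x_train, and x_valid as above; the file names are placeholders):

import os
import numpy as np

def cached_quanv(xs, kernel, cache_path):
    # Load previously encoded features if they exist, otherwise encode and save them.
    if os.path.exists(cache_path):
        return np.load(cache_path)
    feats = np.asarray([quanv(img, kernel) for img in xs])
    np.save(cache_path, feats)
    return feats

q_train = cached_quanv(x_train, kr, "q_train.npy")
q_valid = cached_quanv(x_valid, kr, "q_valid.npy")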

rkuo2000 commented 3 years ago

Yes, if line #22 in data_generator.py is commented out, it will convert the labels for the listed commands.

I also added a --dataroot argument (for pointing at the Kaggle dataset path):

parser.add_argument("--dataroot", type=str, default="../dataset", help="dataroot path")
args = parser.parse_args()
train_audio_path = args.dataroot
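With that in place, the run command would look something like this (the dataset path is a placeholder):

python main_qsr.py --mel 1 --quanv 1 --dataroot /path/to/speech_commands_v0.01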

I wonder whether this QCNN model would be good for few-shot keyword spotting (wake-up command detection)? Here is a few-shot keyword spotting paper for your reference.

huckiyang commented 3 years ago

Hello @rkuo2000. Yes, that part is for QPU simulation with IBM machines; see the qiskit setup. Feel free to modify the code and apply your own arguments, since environments vary.

Thank you for sharing this paper. I think the QCNN model does have potential for feature transfer and few-shot learning in speech and language processing, but honestly I am not an expert on keyword spotting.

I think we are planning to submit a "QML for speech processing" tutorial to Interspeech 2021, but our interests lie more in fundamental system design and some theoretical analysis on designing a hybrid QML-DNN system for the community.

If you are interested in parameterized trainable quantum circuits for function approximation, feel free to check another paper from our team last year. Media for 1st Award

Variational Quantum Circuits for Deep Reinforcement Learning

S Yen-Chi Chen, CH Huck Yang, J Qi, PY Chen, X Ma, HS Goan

Feel free to use the provided implementation as a reference for your research applications, and good luck with that, sincerely!

rkuo2000 commented 3 years ago

Great work, thanks!