YuanGongND / gopt

Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
BSD 3-Clause "New" or "Revised" License

step 3 in inference your own data confusion #19

Open amandeepbaberwal opened 1 year ago

amandeepbaberwal commented 1 year ago

Hello, I am following your guide to infer my own data here, and I am confused about step 20. Where should I run the step 20 Python code? Here is the tree of my data folder:

data
├── raw_kaldi_gop
│   └── librispeech
│       ├── te_feats.csv
│       ├── te_keys_phn.csv
│       ├── te_keys_word.csv
│       ├── te_labels_phn.csv
│       ├── te_labels_word.csv
│       ├── tr_feats.csv
│       ├── tr_keys_phn.csv
│       ├── tr_keys_word.csv
│       ├── tr_labels_phn.csv
│       └── tr_labels_word.csv
├── README.md
├── seq_data_librispeech
│   ├── te_feat.npy
│   ├── te_label_phn.npy
│   ├── te_label_utt.npy
│   ├── te_label_word.npy
│   ├── tr_feat.npy
│   ├── tr_label_phn.npy
│   ├── tr_label_utt.npy
│   └── tr_label_word.npy
├── seq_data_paiia
│   ├── te_feat.npy
│   ├── te_label_phn.npy
│   ├── te_label_utt.npy
│   ├── te_label_word.npy
│   ├── tr_feat.npy
│   ├── tr_label_phn.npy
│   ├── tr_label_utt.npy
│   └── tr_label_word.npy
└── seq_data_paiib
    ├── te_feat.npy
    ├── te_label_phn.npy
    ├── te_label_utt.npy
    ├── te_label_word.npy
    ├── tr_feat.npy
    ├── tr_label_phn.npy
    ├── tr_label_utt.npy
    └── tr_label_word.npy

**My exp folder**
final.py

├── gopt-1e-3-3-1-25-24-gopt-librispeech-br
│   └── result_summary.csv
├── gopt-1e-3-3-1-25-24-gopt-librispeech-br-0
│   ├── models
│   │   └── best_audio_model.pth
│   ├── preds
│   │   ├── phn_pred.npy
│   │   ├── phn_target.npy
│   │   ├── utt_pred.npy
│   │   ├── utt_target.npy
│   │   ├── word_pred.npy
│   │   └── word_target.npy
│   └── result.csv
├── gopt-1e-3-3-1-25-24-gopt-librispeech-br-1
│   ├── models
│   │   └── best_audio_model.pth
│   ├── preds
│   │   ├── phn_pred.npy
│   │   ├── phn_target.npy
│   │   ├── utt_pred.npy
│   │   ├── utt_target.npy
│   │   ├── word_pred.npy
│   │   └── word_target.npy
│   └── result.csv
├── gopt-1e-3-3-1-25-24-gopt-librispeech-br-2
│   ├── models
│   │   └── best_audio_model.pth
│   ├── preds
│   │   ├── phn_pred.npy
│   │   ├── phn_target.npy
│   │   ├── utt_pred.npy
│   │   ├── utt_target.npy
│   │   ├── word_pred.npy
│   │   └── word_target.npy
│   └── result.csv
├── gopt-1e-3-3-1-25-24-gopt-librispeech-br-3
│   ├── models
│   │   └── best_audio_model.pth
│   ├── preds
│   │   ├── phn_pred.npy
│   │   ├── phn_target.npy
│   │   ├── utt_pred.npy
│   │   ├── utt_target.npy
│   │   ├── word_pred.npy
│   │   └── word_target.npy
│   └── result.csv
├── gopt-1e-3-3-1-25-24-gopt-librispeech-br-4
│   ├── models
│   │   └── best_audio_model.pth
│   ├── preds
│   │   ├── phn_pred.npy
│   │   ├── phn_target.npy
│   │   ├── utt_pred.npy
│   │   ├── utt_target.npy
│   │   ├── word_pred.npy
│   │   └── word_target.npy
│   └── result.csv
└── README.md

YuanGongND commented 1 year ago

Please note this is not an official tutorial so you might need to figure it out by yourself, and you need to take care of the bug I pointed out in the readme file.

-Yuan

YuanGongND commented 1 year ago

A general suggestion is to first fully reproduce the original code on so762 and then get to your own data.
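
Before moving from the reproduced setup to a custom set, it may help to confirm that the new data folder has the same file layout as the `seq_data_*` folders listed earlier in this thread. The snippet below is only a minimal sketch of such a check: the helper name `missing_seq_files` and the path in the usage line are hypothetical, and the file-name pattern is inferred from the listings above rather than taken from the repository code.

```python
from pathlib import Path

# File-name pattern inferred from the seq_data_* listings in this thread
# (an assumption, not read from the gopt source).
SUFFIXES = ("feat", "label_phn", "label_utt", "label_word")


def missing_seq_files(data_dir, splits=("tr", "te")):
    """Return the expected .npy file names that are absent from data_dir."""
    root = Path(data_dir)
    expected = [f"{split}_{suffix}.npy" for split in splits for suffix in SUFFIXES]
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical path; adjust it to wherever the data folder actually lives.
    print(missing_seq_files("data/seq_data_librispeech"))
```

An empty list means every expected file is present. It says nothing about whether the array contents or shapes are what the training script expects.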