kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

When I run run.sh in gop_speechocean762, it returns an error in visualize_feats.py #4831

Closed amandeepbaberwal closed 9 months ago

amandeepbaberwal commented 1 year ago

steps/align_mapped.sh: done aligning data.
local/visualize_feats.py --phone-symbol-table data/lang_nosp/phones-pure.txt exp/gop_train/feat.scp data/local/scores.json exp/gop_train/feats.png
Traceback (most recent call last):
  File "local/visualize_feats.py", line 75, in <module>
    main()
  File "local/visualize_feats.py", line 68, in main
    features = TSNE(n_components=2).fit_transform(features)
  File "/home/xyz/Desktop/kaldi-master/egs/gop_speechocean762/s5/env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py", line 1118, in fit_transform
    self._check_params_vs_input(X)
  File "/home/xyz/Desktop/kaldi-master/egs/gop_speechocean762/s5/env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py", line 828, in _check_params_vs_input
    if self.perplexity >= X.shape[0]:
AttributeError: 'tuple' object has no attribute 'shape'

What is it expecting?

danpovey commented 1 year ago

@jimbozhang

jimbozhang commented 1 year ago

Hi @amandeepbaberwal, could you provide the following inputs of visualize_feats.py:

amandeepbaberwal commented 1 year ago

Hi, sorry for the late reply. Here are the files; I have compressed them: files.zip

By the way, I am following the gopt guide, using one speaker for training and the same one for testing. Note that it still throws the same error when I run it with the original speechocean data.

jimbozhang commented 1 year ago

Hi @amandeepbaberwal, could you please share the file exp/gop_train/feat.1.ark from your environment?

amandeepbaberwal commented 1 year ago

Umm, I kind of overwrote it while trying to solve the issue. Does visualize_feats.py change the outcome of GOP, or is it just for visualizing the output? However, I asked someone about the same issue, and it no longer throws errors. I changed the file like this: visualize_feats.py.zip

jimbozhang commented 1 year ago

As its name suggests, visualize_feats.py is only responsible for visualizing the features and does not affect the scoring. It seems that the current version of the sklearn.manifold.TSNE module does not convert the input from List to ndarray. Thank you for fixing this issue.
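For anyone hitting the same traceback, here is a minimal sketch of that kind of fix, not the exact patch from the attached zip. It assumes the features collected in visualize_feats.py are a list or tuple of equal-length vectors; the helper names `to_feature_matrix` and `embed_2d` are hypothetical. The perplexity cap is there because the traceback shows TSNE comparing perplexity against the number of samples, which matters for a small single-speaker set.

```python
import numpy as np


def to_feature_matrix(features):
    """Stack a list or tuple of equal-length vectors into a 2-D float array."""
    matrix = np.asarray(features, dtype=np.float64)
    if matrix.ndim != 2:
        raise ValueError(f"expected a 2-D feature matrix, got shape {matrix.shape}")
    return matrix


def embed_2d(features, perplexity=30.0):
    """Project features to 2-D with t-SNE after converting them to an ndarray."""
    # Imported here so the conversion helper works without scikit-learn installed.
    from sklearn.manifold import TSNE

    matrix = to_feature_matrix(features)
    # TSNE requires perplexity to be smaller than the number of samples.
    perplexity = min(perplexity, matrix.shape[0] - 1)
    return TSNE(n_components=2, perplexity=perplexity).fit_transform(matrix)
```

In the script, the failing call on line 68 would then become something like `features = embed_2d(features)`.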

amandeepbaberwal commented 1 year ago

Could you please help me with another question, about the GOP score output?

jimbozhang commented 1 year ago

I’m sorry, I don’t understand the question regarding the gop-score-output. Could you provide a detailed explanation?

amandeepbaberwal commented 1 year ago

I am confused about the gop-score-output of the gopt guide. I am following this guide to generate a GOP score for a single .wav file. It generates the following output:

tensor([[1.7145]])
tensor([[1.5356]])
tensor([[1.7058]])
tensor([[1.6954]])
tensor([[1.7448]])
tensor([[[1.1899], [1.1255], [1.1371], [1.2073], [1.1969], [1.1274], [1.1659], [1.1577], [1.0568], [1.0990],
         [1.1053], [1.1656], [1.0931], [1.1343], [1.0896], [1.1648], [0.9780], [1.1054], [1.1448], [1.0803],
         [1.1760], [1.0661], [1.1731], [1.1523], [1.0214], [1.1662], [0.7840], [0.8092], [0.7268], [0.8243],
         [0.5917], [0.7283], [0.6889], [0.5892], [0.7459], [0.7653], [0.6947], [0.7011], [0.8332], [0.7986],
         [0.7528], [0.8058], [0.7918], [0.7757], [0.7859], [0.8021], [0.7573], [0.7021], [0.8279], [0.7559]]])
tensor([[[ 0.1661], [ 0.1054], [ 0.1160], [ 0.2031], [ 0.1574], [ 0.1005], [ 0.1371], [ 0.1062], [-0.0253], [ 0.0198],
         [ 0.0989], [ 0.1436], [ 0.0078], [ 0.1346], [ 0.0224], [ 0.0781], [-0.1487], [ 0.0630], [ 0.0780], [-0.0311],
         [ 0.0782], [ 0.0074], [ 0.1364], [ 0.2067], [-0.0944], [ 0.0836], [ 1.0057], [ 1.0423], [ 0.9618], [ 1.0206],
         [ 0.8295], [ 0.9385], [ 0.9107], [ 0.8241], [ 0.9541], [ 0.9885], [ 0.9241], [ 0.9337], [ 1.0537], [ 1.0116],
         [ 0.9845], [ 1.0131], [ 1.0172], [ 0.9951], [ 1.0040], [ 1.0108], [ 0.9789], [ 0.9222], [ 1.0295], [ 0.9657]]])
tensor([[[0.7327], [0.6098], [0.6997], [0.7779], [0.6948], [0.6550], [0.7178], [0.6607], [0.4606], [0.5367],
         [0.7319], [0.7513], [0.5215], [0.6651], [0.5181], [0.5906], [0.3402], [0.6393], [0.6960], [0.4557],
         [0.5759], [0.5481], [0.7242], [0.8806], [0.3691], [0.6580], [0.9815], [1.0061], [0.9133], [1.0248],
         [0.8083], [0.9803], [0.8805], [0.8214], [0.9212], [0.9522], [0.8914], [0.9105], [1.0034], [0.9554],
         [0.9227], [0.9504], [0.9554], [0.9305], [0.9597], [0.9850], [0.9388], [0.8596], [0.9963], [0.9295]]])
tensor([[[1.0473], [0.9881], [1.0186], [1.0791], [1.0340], [0.9847], [1.0329], [0.9877], [0.8908], [0.9047],
         [0.9891], [1.0623], [0.9028], [1.0118], [0.9135], [0.9626], [0.8007], [0.9778], [0.9829], [0.9262],
         [0.9731], [0.9155], [1.0323], [1.0655], [0.8391], [0.9893], [1.0771], [1.1251], [1.0513], [1.1424],
         [0.9450], [1.0673], [0.9994], [0.9339], [1.0477], [1.0645], [1.0334], [1.0408], [1.1635], [1.1052],
         [1.0518], [1.1003], [1.0981], [1.0828], [1.0778], [1.1168], [1.0564], [1.0127], [1.1349], [1.0602]]])

In step 20 (gop output), the guide generates this output using the following line:

u1, u2, u3, u4, u5, p, w1, w2, w3 = gopt(t_input_feat.float(),t_phn.float())

According to my understanding, u1...u5 are utterance-level scores (accuracy, completeness, fluency, prosodic, total), p is the phone-level score, and w1...w3 are word-level scores, but I don't know how to turn these scores into a human-readable form, i.e. 0-100, especially the outputs 'p' and 'w1...w3'.
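One possible way to get a 0-100 reading is a plain linear rescale, sketched below. The raw range of 0 to 2 is purely an assumption made for illustration; nothing in this thread confirms which scale gopt uses, so the `low` and `high` defaults, like the helper name `to_percent`, are hypothetical and should be checked with the gopt authors.

```python
def to_percent(score, low=0.0, high=2.0):
    """Linearly map a raw score from [low, high] to 0-100, clamping outliers.

    The default range is an assumption, not a documented property of gopt.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (score - low) / (high - low)
    return 100.0 * min(1.0, max(0.0, fraction))


# The five utterance-level values from the output above, as plain floats.
raw_utterance_scores = [1.7145, 1.5356, 1.7058, 1.6954, 1.7448]
percent_scores = [round(to_percent(s), 1) for s in raw_utterance_scores]
```

The same function applies per element to the phone-level and word-level tensors once they are flattened to floats, for example with `p.squeeze().tolist()`. Clamping keeps slightly negative or above-range predictions inside 0-100.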

jimbozhang commented 1 year ago

I’m not familiar with gopt. I suggest you ask the authors of gopt about this question.

amandeepbaberwal commented 1 year ago

> Hi @amandeepbaberwal, could you please share the file exp/gop_train/feat.1.ark from your environment?

Hi @jimbozhang, can we calculate sentence fluency using the feat.1.ark file? I extracted it and got vectors. I think GOP is extracted from this file, if I am not wrong.
