bringtree / question_embedding

The issues in this repository record all sorts of odd things (100+).

Kaldi debugging #171

Open bringtree opened 5 years ago

bringtree commented 5 years ago
struct NnetIo {
  std::string name;
  std::vector<Index> indexes;
  GeneralMatrix features;
GeneralMatrix:
  Matrix<BaseFloat> mat_;
  CompressedMatrix cmat_;
  SparseMatrix<BaseFloat> smat_;

To make debugging easier, change `protected` to `public` in the matrix class.

Then print the matrix:

(kaldi::Matrix<float>) $12 = {
  kaldi::MatrixBase<float> = {
    data_ = 0x000000010208a000
    num_cols_ = 40
    num_rows_ = 220
    stride_ = 40 // True number of columns ???
  }
}
  Real*   data_;

data_ is an address (a raw pointer), but that is fine.

The next thing to work out: is it enough to change only the values that data_ points to, or is some other operation needed?
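On the layout side of that question: Kaldi's MatrixBase keeps its elements row-major, with element (r, c) at data_[r * stride_ + c], and stride_ may be larger than num_cols_ when rows are padded for alignment. Below is a minimal Python sketch of that indexing, with a flat list standing in for the raw buffer. The helper names are hypothetical, and the padded stride of 6 is a made-up value chosen only to show the difference from num_cols_.

```python
def flat_index(row, col, stride):
    """Offset of element (row, col) in a row-major buffer with the given stride."""
    return row * stride + col


def set_element(buf, row, col, value, num_rows, num_cols, stride):
    """Write one element in place, bounds-checked against the logical shape."""
    if not (0 <= row < num_rows and 0 <= col < num_cols):
        raise IndexError("element outside the logical matrix")
    buf[flat_index(row, col, stride)] = value


# Shape from the debugger dump: 220 x 40 with stride 40 (no padding),
# so the second row starts right after the first 40 values.
assert flat_index(1, 0, stride=40) == 40

# Made-up padded case: a 2 x 4 logical matrix stored with stride 6.
num_rows, num_cols, stride = 2, 4, 6
buf = [0.0] * (num_rows * stride)
set_element(buf, 1, 2, 7.5, num_rows, num_cols, stride)
```

With stride_ equal to num_cols_, as in the dump, the buffer is contiguous, so overwriting values through data_ should not require touching num_rows_, num_cols_, or stride_.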

So I still need to analyze the text structure of the ark. (Also, an scp apparently can index into an ark.)

Also, ark,t: makes the output text; the default, ark:, writes a binary file.

There is also ark,scp:, though I am not sure yet what it is for: ark,scp:xxxx.ark,xxxx.scp
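As far as I can tell from Kaldi's table I/O, the ark,scp: form writes the archive and, alongside it, an scp file whose entries point into that archive. The sketch below only splits such a wspecifier string into its parts. parse_wspecifier is a hypothetical helper, not a Kaldi function, and splitting the two output names at the first comma is an assumption.

```python
def parse_wspecifier(wspecifier):
    """Split a Kaldi-style wspecifier into its options and output names."""
    options_part, _, rest = wspecifier.partition(":")
    options = [opt.strip() for opt in options_part.split(",")]
    result = {
        "binary": "t" not in options,  # binary unless the t option is given
        "archive": None,
        "script": None,
    }
    if "ark" in options and "scp" in options:
        # Both outputs: "archive,script" after the colon
        # (split at the first comma, which is an assumption).
        result["archive"], _, result["script"] = rest.partition(",")
    elif "ark" in options:
        result["archive"] = rest
    elif "scp" in options:
        result["script"] = rest
    else:
        raise ValueError("expected ark and/or scp in " + repr(wspecifier))
    return result


both = parse_wspecifier("ark,scp:xxxx.ark,xxxx.scp")
text_only = parse_wspecifier("ark,t:./xx.ark")
```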

[huangps@gpu61 train]$ ll
total 70280
-rw-r--r-- 1 sunyq kaiwoo   278343 Apr 26 16:19 cmvn.scp
-rw-r--r-- 1 sunyq kaiwoo 20987958 Apr 26 16:19 feats.scp
-rw-r--r-- 1 sunyq kaiwoo    97520 Apr 26 16:19 reco2file_and_channel
-rw-r--r-- 1 sunyq kaiwoo 12434632 Apr 26 16:19 segments
-rw-r--r-- 1 sunyq kaiwoo  6388384 Apr 26 16:19 spk2utt
-rw-r--r-- 1 sunyq kaiwoo 22020307 Apr 26 16:19 text
-rw-r--r-- 1 sunyq kaiwoo  8981134 Apr 26 16:19 utt2spk
-rw-r--r-- 1 sunyq kaiwoo   765532 Apr 26 16:19 wav.scp
[huangps@gpu61 train]$ head feats.scp
sw02001-A_000098-001156 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:24
sw02001-A_001980-002131 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:13901
sw02001-A_002736-002893 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:15987
sw02001-A_003390-004012 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:18151
sw02001-A_004012-004155 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:26360
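
Each feats.scp line in this listing pairs an utterance key with an archive path and, after the final colon, a byte offset into that archive, which is how an scp indexes into an ark. A minimal Python sketch of splitting one line (parse_scp_line is a hypothetical helper, not part of Kaldi):

```python
def parse_scp_line(line):
    """Split 'key rxfilename[:offset]' into (key, path, offset or None)."""
    key, rxfilename = line.strip().split(None, 1)
    path, sep, tail = rxfilename.rpartition(":")
    if sep and tail.isdigit():
        return key, path, int(tail)
    # No numeric suffix: treat the whole thing as the file name.
    return key, rxfilename, None


line = "sw02001-A_000098-001156 /reserve/sunyq/feat/mfcc/raw_mfcc_train.1.ark:24"
key, path, offset = parse_scp_line(line)
```

To inspect the entry itself, one would open path in binary mode and seek to offset before reading.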

OK, start analyzing what is inside the ark.

<Nnet3ChainEg>
    <NumInputs> 
        <NnetIo>
        <NumOutputs>
            <NnetChainSup>
                <I1V> <I1>... 
                <Supervision> 
                Weight NumSequences FramesPerSeq LabelDim End2End
                <Fsts> kfst-related: source state, destination state, input, output, weight
                <AlignmentPdfs>             
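
When reading a text-mode (ark,t:) dump, a quick first step is to count the angle-bracket tokens and see which sections occur and how often. A minimal Python sketch follows. count_tokens is a hypothetical helper, and the sample string is a hand-written stand-in built from the token names in this outline, not real egs output.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"<(\w+)>")


def count_tokens(text):
    """Count each <Token> marker in a text-mode dump."""
    return Counter(TOKEN_RE.findall(text))


# Hand-written stand-in, not real output from nnet3-chain-copy-egs.
sample = (
    "<Nnet3ChainEg> <NumInputs> 1 <NnetIo> <I1V> 2 <I1> <I1> "
    "<NumOutputs> 1 <NnetChainSup> <Supervision>"
)
counts = count_tokens(sample)
```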

Next, the matrix operations.

ll exp/chain/tdnn7q_noivector_sp/egs/

cp -r /reserve/sunyq/egs/swbd/s5c/exp/chain/tdnn7q_sp /reserve/huangps/swbd/s5c/exp/chain/tdnn7q_sp
cp -r /reserve/sunyq/egs/swbd/s5c/exp/chain/tri5_7d_tree_sp /reserve/huangps/swbd/s5c/exp/chain/tri5_7d_tree_sp

nnet3-chain-copy-egs --frame-shift=1 ark:/reserve/sunyq/egs/swbd/s5c/exp/chain/tdnn7q_noivector_sp/egs/cegs.1.ark ark,t:./tset.ark:- |nnet3-chain-spec-augment --mask_row_times=1 --mask_col_times=1 --max_mask_row_width=10 --max_mask_col_width=10 ark,t:tset.ark ark,t:./xx.ark
// Documentation for "rspecifier"
// "rspecifier" describes how we read a set of objects indexed by keys.
// The possibilities are:
//
// ark:rxfilename
// scp:rxfilename
//
// We also allow various modifiers:
//   o   means the program will only ask for each key once, which enables
//       the reader to discard already-asked-for values.
//   s   means the keys are sorted on input (means we don't have to read till
//       eof if someone asked for a key that wasn't there).
//   cs  means that it is called in sorted order (we are generally asserting
//       this based on knowledge of how the program works).
//   p   means "permissive", and causes it to skip over keys whose corresponding
//       scp-file entries cannot be read. [and to ignore errors in archives and
//       script files, and just consider the "good" entries].
//       We allow the negation of the options above, as in no, ns, np,
//       but these aren't currently very useful (just equivalent to omitting the
//       corresponding option).
//       [any of the above options can be prefixed by n to negate them, e.g. no,
//       ns, ncs, np; but these aren't currently useful as you could just omit
//       the option].
//   bg means "background".  It currently has no effect for random-access readers,
//       but for sequential readers it will cause it to "read ahead" to the next
//       value, in a background thread.  Recommended when reading larger objects
//       such as neural-net training examples, especially when you want to
//       maximize GPU usage.
//
//   b   is ignored [for scripting convenience] , opts->binary = true 
//   t   is ignored [for scripting convenience] , opts->binary = false 
//
//
//  So for instance the following would be a valid rspecifier:
//
//   "o, s, p, ark:gunzip -c foo.gz|"
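
To make the comment above concrete, here is a minimal Python sketch that splits an rspecifier into its type, its modifiers, and the rxfilename. parse_rspecifier is a hypothetical helper that only mirrors the format described in the comment. It is not Kaldi's own parser, and stripping the spaces around each option is an assumption based on the example string.

```python
def parse_rspecifier(rspecifier):
    """Return (kind, modifiers, rxfilename) for strings like 'o,s,ark:foo.ark'."""
    options_part, sep, rxfilename = rspecifier.partition(":")
    if not sep:
        raise ValueError("no ':' in " + repr(rspecifier))
    tokens = [tok.strip() for tok in options_part.split(",")]
    kinds = [tok for tok in tokens if tok in ("ark", "scp")]
    if len(kinds) != 1:
        raise ValueError("expected exactly one of ark/scp in " + repr(rspecifier))
    modifiers = [tok for tok in tokens if tok not in ("ark", "scp")]
    return kinds[0], modifiers, rxfilename


kind, modifiers, rxfilename = parse_rspecifier("o, s, p, ark:gunzip -c foo.gz|")
```

Only the first colon separates the options from the rxfilename, so a file name that itself contains a colon, such as an archive path with a byte offset, stays intact.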
bringtree commented 5 years ago

Also, when debugging, change the compiler optimization flag from -O1 to -O0.