problems for training dm-dnn

Hello,

Sorry for the late reply.

Unlike most tools in kaldi, which assume that the output of a DNN is a set of posteriors, the output of the DNNs used by idlak is always a feature matrix. You will probably need to customise the binaries if you plan to use a different training tool from the one provided.

What you are showing here are the output features for training the duration model, which learns a mapping from "labels", i.e. features representing input phone identities, to the relevant durations; these durations are then used to generate frame-level input for the acoustic model. On each line you get 2 features: the first is the duration of the current HMM state, and the second is the duration of the whole phone. Assuming each phone is spread over 5 states, you will get a sequence of 5 lines with the same phone duration, and the sum of the 5 HMM state durations should match that phone duration, i.e. 1 + 11 + 12 + 11 + 8 = 43 if we consider the first phone here.
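
If you want to sanity-check your own output features, the consistency rule above can be sketched in a few lines of Python. This is only an illustration, not part of idlak: the function name `check_duration_rows` is hypothetical, and it assumes you have already parsed the feature matrix into (state duration, phone duration) pairs holding raw, un-normalised frame counts, with 5 states per phone.

```python
def check_duration_rows(rows, states_per_phone=5):
    """Verify that state durations add up to the phone duration.

    rows: list of (state_duration, phone_duration) pairs, one per HMM state,
          assumed to be raw frame counts (hypothetical pre-parsed input).
    Returns the list of phone durations, one per phone.
    Raises ValueError if a group of rows is inconsistent.
    """
    if len(rows) % states_per_phone != 0:
        raise ValueError("number of rows is not a multiple of states_per_phone")

    phone_durations = []
    for start in range(0, len(rows), states_per_phone):
        group = rows[start:start + states_per_phone]
        phone_dur = group[0][1]

        # Every state of a phone should carry the same phone duration.
        if any(p != phone_dur for _, p in group):
            raise ValueError(f"phone duration changes within rows {start}-{start + states_per_phone - 1}")

        # The state durations should sum to the phone duration.
        state_sum = sum(s for s, _ in group)
        if state_sum != phone_dur:
            raise ValueError(f"states sum to {state_sum}, expected {phone_dur}")

        phone_durations.append(phone_dur)
    return phone_durations


# The first phone from the example above: 1 + 11 + 12 + 11 + 8 = 43.
first_phone = [(1, 43), (11, 43), (12, 43), (11, 43), (8, 43)]
print(check_duration_rows(first_phone))  # prints [43]
```

If your features have been normalised or otherwise transformed, the check would need to run on the de-normalised values instead.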

I'm not sure what you mean by "dm-dnn training is completely failed" :-)

Regards, Blaise

bpotard / idlak
