Has anyone done item 4 yet? If not, I can do it.
@david-ryan-snyder not yet, please go ahead.
I am interested in this task. I will work on it.
@snidada1 It would be very convenient if you kept updating this issue with progress reports rather than just sending @jtrmal and me emails.
We once again need help with this issue, please let us know if you are interested.
I've conducted some experiments with multiple languages using chain recipes, and I'd like to point out a change I had to make in the binaries:
nnet-chain-training: The denominator-graph constructor uses the dimension of the 'output' layer by default. The quickest way to deal with this was to define default arguments through the call stack up to the binary options, letting the user replace the 'output' layer with a custom output for a specific language. This shouldn't impact any other experiments, but it does look ugly. Is there another way to get the graph dimension without checking a specific output?
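A minimal sketch of that change, assuming a new option added to the existing NnetChainTrainingOptions struct (the "output-name" option and the output_name field are invented here for illustration, not existing Kaldi code) and threaded down to where the denominator graph is sized:

struct NnetChainTrainingOptions {
  // ... existing members and options ...
  std::string output_name;  // which output node sizes the denominator graph.

  NnetChainTrainingOptions(): output_name("output") { }

  void Register(OptionsItf *opts) {
    // ... existing Register() calls ...
    opts->Register("output-name", &output_name,
                   "Name of the output node whose dimension is used when "
                   "building the denominator graph (default: output).");
  }
};

// ... and in the trainer's initializer list, replace the hardcoded name:
//   den_graph_(den_fst, nnet->OutputDim(opts.output_name)),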
My experiments were all done through manual configuration of nnet3 configs. Since I was training on a single GPU instance, I made alternating calls to the TrainOneIteration function for each language (English and Portuguese), each with its own arguments.
If you are computing the cost function using a particular output throughout the training iteration, then you could just rename the output of interest to "output". You can do this by adding a binary which does the renaming when the model is passed to the chain trainer.
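A minimal sketch of what the core of such a renaming binary might look like, assuming Nnet::GetNodeIndex() and Nnet::SetNodeName() from nnet3/nnet-nnet.h (a real binary would add the usual ParseOptions and model read/write boilerplate):

#include "nnet3/nnet-nnet.h"

namespace kaldi {
namespace nnet3 {

// Renames a per-language output node, e.g. "output-pt", to "output" so
// that the chain trainer picks it up without any other changes.
void RenameOutput(const std::string &old_name, Nnet *nnet) {
  int32 node_index = nnet->GetNodeIndex(old_name);
  KALDI_ASSERT(node_index >= 0 && nnet->IsOutputNode(node_index));
  nnet->SetNodeName(node_index, "output");
}

}  // namespace nnet3
}  // namespace kaldi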
You might get better performance if you switched between languages more frequently, e.g. at the minibatch level. @pegahgh might be able to give you more information about this. However, this would require many more changes to the C++ code.
I think Pegah [her handle is @pegahgh] has been working on multi-language nnet3 training, but not chain training so far, as far as I know.
I believe the approach we decided on is to mix the languages together in the archives of egs on disk, but to have the nnet3-merge-egs program batch them up in such a way that it spits out batches containing only one language at a time. That means there are only a few computations to compile, and we don't spend half the time in compilation.
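An illustrative sketch of that bucketing idea (not the actual nnet3-merge-egs implementation; the function name, the single-supervision assumption, and the key scheme are mine):

#include <unordered_map>
#include <vector>
#include "nnet3/nnet-chain-example.h"

namespace kaldi {
namespace nnet3 {

// Collects chain examples per output name (one name per language, e.g.
// "output-en", "output-pt") and merges a minibatch only from examples
// that share a name.  Incomplete buckets are dropped at the end.
void BucketAndMergeEgs(int32 minibatch_size,
                       SequentialNnetChainExampleReader *reader,
                       NnetChainExampleWriter *writer) {
  std::unordered_map<std::string, std::vector<NnetChainExample> > buckets;
  int32 num_written = 0;
  for (; !reader->Done(); reader->Next()) {
    const NnetChainExample &eg = reader->Value();
    // Assumes each example has a single supervision output whose name
    // identifies the language.
    const std::string &name = eg.outputs[0].name;
    std::vector<NnetChainExample> &bucket = buckets[name];
    bucket.push_back(eg);
    if (static_cast<int32>(bucket.size()) == minibatch_size) {
      NnetChainExample merged;
      MergeChainExamples(false /* compress */, &bucket, &merged);
      writer->Write(name + "-" + std::to_string(num_written++), merged);
      bucket.clear();
    }
  }
}

}  // namespace nnet3
}  // namespace kaldi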
It looks like the core code in chain training doesn't assume that the output name is 'output', but does assume that the output name and its corresponding 'xent' output differ by an '-xent' suffix:
nnet-chain-training.cc: std::string xent_name = sup.name + "-xent"; // typically "output-xent".
So for chain training, it's probably only the egs generation that would have to be changed.
In general, I recommend changing the names of the output nodes to 'output' only for decoding. For training, it's probably better to modify the egs generation to use different output names per language, and this may require adding options to binaries. I hope that at some point we can work with Pegah to get this stuff checked in.
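On the egs-generation side, the change could be as small as renaming the supervision when an example is created or copied for a given language. A sketch (the function is illustrative, not existing code, though NnetChainExample::outputs and the public 'name' member are real):

#include "nnet3/nnet-chain-example.h"

namespace kaldi {
namespace nnet3 {

// Gives the supervision in an example a per-language name such as
// "output-pt", matching the corresponding output node of the network.
void RenameEgOutputs(const std::string &lang, NnetChainExample *eg) {
  for (size_t i = 0; i < eg->outputs.size(); i++)
    if (eg->outputs[i].name == "output")
      eg->outputs[i].name = "output-" + lang;
}

}  // namespace nnet3
}  // namespace kaldi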
Oh, now I see the specific line of code you were asking about:
NnetChainTrainer::NnetChainTrainer(const NnetChainTrainingOptions &opts,
                                   const fst::StdVectorFst &den_fst,
                                   Nnet *nnet):
    opts_(opts),
    den_graph_(den_fst, nnet->OutputDim("output")),  // here
    nnet_(nnet),
    ...
To implement multi-language chain training in the same way as I mentioned above for nnet3 training, we'd probably need multiple of these NnetChainTrainer objects, one per language, and we'd need to pass the output name in somehow. Perhaps the NnetChainTrainer objects could be kept in a map indexed by output name, and only actually initialized once we read in a minibatch [so we know the output name]. We can worry about this when we actually check in scripts and code for the rest of this, though.
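A hypothetical sketch of that map-of-trainers idea; it assumes a variant of the NnetChainTrainer constructor that takes the output name (no such constructor exists yet), and for simplicity it shares one den_fst across languages, though in practice each language would likely need its own:

#include <map>
#include <memory>
#include <string>
#include "nnet3/nnet-chain-training.h"

namespace kaldi {
namespace nnet3 {

class MultilangChainTrainer {
 public:
  MultilangChainTrainer(const NnetChainTrainingOptions &opts,
                        const fst::StdVectorFst &den_fst,
                        Nnet *nnet):
      opts_(opts), den_fst_(den_fst), nnet_(nnet) { }

  // Called once per minibatch; creates the per-language trainer the first
  // time its output name is seen, i.e. once a minibatch tells us the name.
  void Train(const std::string &output_name, const NnetChainExample &eg) {
    std::unique_ptr<NnetChainTrainer> &trainer = trainers_[output_name];
    if (trainer == nullptr)
      trainer.reset(new NnetChainTrainer(opts_, den_fst_, nnet_,
                                         output_name));  // hypothetical ctor
    trainer->Train(eg);
  }

 private:
  const NnetChainTrainingOptions &opts_;
  const fst::StdVectorFst &den_fst_;
  Nnet *nnet_;
  std::map<std::string, std::unique_ptr<NnetChainTrainer> > trainers_;
};

}  // namespace nnet3
}  // namespace kaldi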
Dan
being addressed in #1027
We need a new training script, similar to steps/nnet3/chain/train.py, for training with data from multiple languages. The changes required are as follows: