asyml / texar-pytorch

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
https://asyml.io
Apache License 2.0

Question about the usage of helper in TransformerDecoder #228

Closed by ha-lins 5 years ago

ha-lins commented 5 years ago

Hi~ I want to do step-by-step decoding with TransformerDecoder and a TrainingHelper(), but I don't know how to call a per-step forward function the way I would with an RNN, e.g.

outputs, hidden = self.gru(embedded, hidden) # forward for every step

Is this already handled in the step method of the helper class? I'd appreciate your help!

gpengzhi commented 5 years ago

Thank you for your interest in Texar-PyTorch!

The per-step forward computation is implemented in TransformerDecoder.

You can take a look at self._inputs_to_outputs, where we have outputs, state = self._inputs_to_outputs(inputs, state).

ha-lins commented 5 years ago

Actually, I don't understand how the initialize and step functions are called or how they work. The step function also requires a helper argument. How should I build such a helper class? Is there an example of this?

Thanks!

huzecong commented 5 years ago

Sorry but I'm a bit confused. Is your goal to write (from scratch) a new Helper class, or to use an alternative built-in helper with the decoder?

ha-lins commented 5 years ago

@huzecong Specifically, I want to modify an existing built-in helper (e.g. TrainingHelper) rather than write one from scratch, so an example would help me understand it before modifying it. For instance, how do I set the embedding_fn in https://texar-pytorch.readthedocs.io/en/latest/code/modules.html#texar.torch.modules.TrainingHelper ?

huzecong commented 5 years ago

You don't need to do that yourself. helper.initialize is called inside decoder.initialize, where the decoder will pass its own embedding_fn to the helper.

The flow of execution would be:

  1. Construct a helper.
  2. Call decoder.forward with the constructed helper.
  3. decoder.forward calls DecoderBase.dynamic_decode.
  4. DecoderBase.dynamic_decode calls decoder.initialize, which in turn calls helper.initialize.
  5. DecoderBase.dynamic_decode loops over each time step and calls decoder.step, which in turn calls helper.sample.
  6. If the current step is not the final time step, DecoderBase.dynamic_decode calls decoder.next_inputs, which calls helper.next_inputs.
  7. DecoderBase.dynamic_decode calls decoder.finalize.

This may not apply to every decoder-helper pair, but it is a general description of how things work.
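
As a rough illustration of the flow above, here is a minimal pure-Python sketch. ToyHelper and ToyDecoder are hypothetical stand-ins, not Texar-PyTorch classes. Only the method names (initialize, step, sample, next_inputs, finalize, dynamic_decode, _inputs_to_outputs) mirror the ones mentioned in this thread; the signatures, the max_decoding_length stopping rule, and the arithmetic are assumptions made purely for the example.

```python
from typing import Callable, List, Optional, Tuple


class ToyHelper:
    """Hypothetical teacher-forcing helper (loosely modeled on the idea of TrainingHelper)."""

    def __init__(self, target_ids: List[int], log: List[str]):
        self.target_ids = target_ids
        self.log = log
        self.embedding_fn: Optional[Callable[[int], float]] = None

    def initialize(self, embedding_fn: Callable[[int], float]) -> Tuple[bool, Optional[float]]:
        # The decoder hands over its own embedding_fn; the caller never sets it.
        self.log.append("helper.initialize")
        self.embedding_fn = embedding_fn
        finished = len(self.target_ids) == 0
        first = None if finished else embedding_fn(self.target_ids[0])
        return finished, first

    def sample(self, time: int, outputs: List[float]) -> int:
        self.log.append("helper.sample")
        return max(range(len(outputs)), key=outputs.__getitem__)  # argmax

    def next_inputs(self, time: int, outputs: List[float], sample_id: int) -> Tuple[bool, Optional[float]]:
        # Teacher forcing: feed the next ground-truth token, ignoring sample_id.
        self.log.append("helper.next_inputs")
        next_time = time + 1
        finished = next_time >= len(self.target_ids)
        nxt = None if finished else self.embedding_fn(self.target_ids[next_time])
        return finished, nxt


class ToyDecoder:
    """Hypothetical decoder that follows the seven-step flow described above."""

    def __init__(self, log: List[str]):
        self.log = log

    def embedding_fn(self, token_id: int) -> float:
        return float(token_id)  # placeholder "embedding"

    def _inputs_to_outputs(self, inputs: float, state: float):
        state = state + inputs        # placeholder per-step computation
        return [state, -state], state  # fake two-class "logits"

    def initialize(self, helper: ToyHelper):
        self.log.append("decoder.initialize")
        finished, first_inputs = helper.initialize(self.embedding_fn)
        return finished, first_inputs, 0.0  # 0.0 is the initial state

    def step(self, helper: ToyHelper, time: int, inputs: float, state: float):
        self.log.append("decoder.step")
        outputs, state = self._inputs_to_outputs(inputs, state)
        sample_id = helper.sample(time, outputs)
        return (outputs, sample_id), state

    def next_inputs(self, helper: ToyHelper, time: int, step_outputs):
        self.log.append("decoder.next_inputs")
        outputs, sample_id = step_outputs
        return helper.next_inputs(time, outputs, sample_id)

    def finalize(self, collected):
        self.log.append("decoder.finalize")
        return collected

    def dynamic_decode(self, helper: ToyHelper, max_decoding_length: int):
        finished, inputs, state = self.initialize(helper)
        collected, time = [], 0
        while not finished and time < max_decoding_length:
            step_outputs, state = self.step(helper, time, inputs, state)
            collected.append(step_outputs)
            if time + 1 == max_decoding_length:
                finished = True  # final step: no next inputs are needed
            else:
                finished, inputs = self.next_inputs(helper, time, step_outputs)
            time += 1
        return self.finalize(collected)

    def forward(self, helper: ToyHelper, max_decoding_length: int):
        return self.dynamic_decode(helper, max_decoding_length)


if __name__ == "__main__":
    calls: List[str] = []
    decoder = ToyDecoder(calls)
    decoder.forward(ToyHelper([1, 2], calls), max_decoding_length=2)
    print("\n".join(calls))  # call order of the decoder and helper methods
```

Note that the caller only constructs the helper and passes it to forward; the helper receives embedding_fn from the decoder inside initialize. In this toy setup, customizing a helper would mean overriding sample or next_inputs in a subclass.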

ha-lins commented 5 years ago

Thanks for the detailed explanation!