lucidrains / recurrent-memory-transformer-pytorch

Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch
MIT License

Question: how to adapt this for CTC loss #9

Open pfeatherstone opened 1 year ago

pfeatherstone commented 1 year ago

@lucidrains Do you have any advice on how to adapt RecurrentMemoryTransformerWrapper such that it works with CTC ?

pfeatherstone commented 1 year ago

In the memory replay backpropagation algorithm, the labels are partitioned in the same way as the logits, and the loss is evaluated per block. For CTC that doesn't make sense, since the labels are not necessarily aligned to the blocks. So does memory replay in its current form even apply to CTC? Any help is gratefully received.
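One possible workaround (not from the repo, just a sketch): instead of computing a per-block loss, run the model segment by segment while carrying the memory forward, concatenate the logits from all segments, and apply `torch.nn.functional.ctc_loss` once over the full output sequence. The `SegmentEncoder` below is a hypothetical stand-in for `RecurrentMemoryTransformerWrapper`; the point is only the loss wiring. Note this keeps the whole autograd graph in memory, so it gives up the memory savings of memory replay backpropagation — which is exactly the tension raised above.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for RecurrentMemoryTransformerWrapper:
# a per-segment encoder that carries memory tokens between segments.
class SegmentEncoder(torch.nn.Module):
    def __init__(self, dim=32, num_classes=10, mem_tokens=4):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)
        self.to_logits = torch.nn.Linear(dim, num_classes)
        self.init_mem = torch.nn.Parameter(torch.zeros(mem_tokens, dim))

    def forward(self, segment, memory=None):
        # segment: (batch, seg_len, dim)
        if memory is None:
            memory = self.init_mem.expand(segment.shape[0], -1, -1)
        x = torch.relu(self.proj(torch.cat([memory, segment], dim=1)))
        new_memory, out = x[:, :memory.shape[1]], x[:, memory.shape[1]:]
        return self.to_logits(out), new_memory

model = SegmentEncoder()
batch, seg_len, n_segments, dim = 2, 8, 4, 32
segments = torch.randn(batch, seg_len * n_segments, dim).chunk(n_segments, dim=1)

# Run segment by segment, carrying the memory, collecting logits.
memory = None
logits_per_segment = []
for seg in segments:
    seg_logits, memory = model(seg, memory)
    logits_per_segment.append(seg_logits)

# Concatenate across segments, then apply CTC once over the full sequence,
# so the (unaligned) targets never need to be partitioned per block.
logits = torch.cat(logits_per_segment, dim=1)            # (batch, T, C)
log_probs = logits.log_softmax(dim=-1).transpose(0, 1)   # (T, batch, C)

targets = torch.randint(1, 10, (batch, 6))               # unaligned label sequences
input_lengths = torch.full((batch,), seg_len * n_segments, dtype=torch.long)
target_lengths = torch.full((batch,), 6, dtype=torch.long)

loss = F.ctc_loss(log_probs, targets, input_lengths, target_lengths, blank=0)
loss.backward()  # gradients flow through all segments via the carried memory
```

A middle ground might be to checkpoint each segment's forward pass (e.g. with `torch.utils.checkpoint`) and still defer the single CTC loss to the end, but whether that recovers the memory profile of true memory replay is an open question.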

pfeatherstone commented 1 year ago

@lucidrains Or, setting CTC aside, can you think of a way to make this work with unaligned targets?