haotianteng / Chiron

A basecaller for Oxford Nanopore Technologies' sequencers

attention.py #75

Closed nbathreya closed 5 years ago

nbathreya commented 5 years ago

Hi. I am very new to TensorFlow, ML, and Chiron. I had a few basic questions about the implementation of attention in the Chiron model.

Any feedback would be greatly appreciated. I look forward to hearing back from you as soon as possible.

Nagendra

haotianteng commented 5 years ago

We compared the attention mechanism with CTC; however, attention did not give results as good as CTC, so we switched to CTC. The attention code is still preserved.

The implementation is based on the following papers: https://arxiv.org/abs/1506.07503 https://arxiv.org/abs/1409.0473
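For reference, both cited papers use additive (Bahdanau-style) attention scoring. Below is only a minimal NumPy sketch of that scheme under assumed shapes, not the actual chiron/attention.py implementation; the names `additive_attention`, `W_s`, `W_h`, and `v` are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(decoder_state, encoder_outputs, W_s, W_h, v):
    """One step of additive (Bahdanau-style) attention.

    decoder_state:   [dec_dim]          previous decoder state s_{t-1}
    encoder_outputs: [src_len, enc_dim] encoder states h_1..h_T
    W_s, W_h, v:     learned projections (random placeholders here)

    Returns the attention weights alpha_t and the context vector c_t.
    """
    # e_{t,j} = v^T tanh(W_s s_{t-1} + W_h h_j)
    scores = np.tanh(decoder_state @ W_s + encoder_outputs @ W_h) @ v
    weights = softmax(scores)            # alpha_{t,j}, sums to 1 over j
    context = weights @ encoder_outputs  # c_t = sum_j alpha_{t,j} h_j
    return weights, context

# Toy shapes: 6 encoder steps, enc_dim=8, dec_dim=4, attention dim=5.
rng = np.random.default_rng(0)
enc = rng.normal(size=(6, 8))
s = rng.normal(size=(4,))
W_s, W_h, v = rng.normal(size=(4, 5)), rng.normal(size=(8, 5)), rng.normal(size=(5,))
alpha, c = additive_attention(s, enc, W_s, W_h, v)
print(alpha.shape, c.shape)  # (6,) (8,)
```

In an attention decoder, the context vector `c_t` is fed into the next decoder step and the output projection, which is what lets the model align each predicted label with a region of the raw signal.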

nbathreya commented 5 years ago

So, when you implemented the attention decoder, did you just use attention_loss as the prediction error?

Another question: the attention_loss function docstring says "label_len:[batch_size] label length, the symbol is included." Do we have to include an "end" symbol at the end of each label in the batch when we pass it to this function?
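If the end symbol does count toward label_len, batch preparation might look roughly like the sketch below. This is only an illustration under assumptions: `END_SYMBOL`, `PAD`, and `pad_labels` are hypothetical names, and Chiron's actual label encoding may differ.

```python
import numpy as np

# Hypothetical encoding (assumed, not taken from chiron/attention.py):
# bases A,C,G,T = 0..3, padding = 4, end symbol = 5.
PAD = 4
END_SYMBOL = 5

def pad_labels(label_seqs, max_len):
    """Append the end symbol to every label sequence, then pad to max_len.

    Returns
        labels:    [batch_size, max_len] int array
        label_len: [batch_size] lengths counting the appended end symbol,
                   matching a docstring note that the symbol is included.
    """
    batch_size = len(label_seqs)
    labels = np.full((batch_size, max_len), PAD, dtype=np.int32)
    label_len = np.zeros(batch_size, dtype=np.int32)
    for i, seq in enumerate(label_seqs):
        seq = list(seq) + [END_SYMBOL]   # end symbol appended explicitly
        labels[i, :len(seq)] = seq
        label_len[i] = len(seq)          # includes the end symbol
    return labels, label_len

labels, label_len = pad_labels([[0, 2, 3], [1, 1, 0, 2]], max_len=8)
print(labels)
print(label_len)  # [4 5] -> original length + 1 for the end symbol
```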

Is it possible to provide training code with the attention mechanism enabled? I would like to make sure I use the Chiron model with attention correctly. (I want to see how much difference there is between the attention and non-attention mechanisms and understand mathematically what causes it.) This would help me greatly!