-
# Transformer | asterich's blog
The Transformer is a deep learning model built on self-attention. It abandons the traditional RNN and LSTM structures entirely and instead relies solely on attention mechanisms to capture dependencies within a sequence. Since being introduced by Vaswani et al. in the 2017 paper "Attention Is All You Need", it has become a major milestone in NLP and in the AI field as a whole. Its…
[https://…
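The excerpt breaks off above; as a concrete illustration of the scaled dot-product self-attention the post describes, here is a minimal NumPy sketch. The shapes and random projection matrices are illustrative assumptions, not taken from the blog.
```
# Minimal scaled dot-product self-attention sketch (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Every position attends to every other position, so long-range
    # dependencies are captured without any recurrence.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

# Toy usage: 5 tokens, model width 8, projected to dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)  # shape (5, 4)
```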
-
https://stackoverflow.com/questions/42861460/how-to-interpret-weights-in-a-lstm-layer-in-keras?utm_medium
https://github.com/keras-team/keras/issues/3088
for e in zip(model.layers[0].trainable…
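The loop above is cut off; a plausible completion (an assumption in the spirit of the linked Stack Overflow thread, not the author's exact code) pairs each trainable weight variable with its values so the LSTM's kernel, recurrent kernel and bias shapes can be inspected:
```
from keras.models import Sequential
from keras.layers import LSTM

# Tiny model just so the loop is runnable; the original `model` is unknown.
model = Sequential([LSTM(16, input_shape=(10, 8))])

for e in zip(model.layers[0].trainable_weights, model.layers[0].get_weights()):
    print(e[0].name, e[1].shape)
# kernel:           (8, 64)   = input_dim x 4*units (i, f, c, o gates)
# recurrent_kernel: (16, 64)  = units x 4*units
# bias:             (64,)     = 4*units
```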
-
Hello Professor!
I have a question about this subsection: compare the performance against the naive LSTM approach. Is there any specific architecture that I need to compare my target solution with?…
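For what it's worth, a "naive LSTM approach" usually refers to a plain single-layer LSTM without attention, along the lines of the hedged Keras sketch below; the task shape and layer sizes are assumptions, not a prescribed baseline.
```
# Hypothetical single-layer LSTM baseline for sequence classification.
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 10000, 50, 2  # assumed task shape

baseline = Sequential([
    Embedding(VOCAB_SIZE, 128, input_length=SEQ_LEN),
    LSTM(128),                        # no attention, single direction
    Dense(NUM_CLASSES, activation='softmax'),
])
baseline.compile(optimizer='adam', loss='categorical_crossentropy',
                 metrics=['accuracy'])
```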
-
I used attention here.
```
def decoding_layer(dec_input, encoder_outputs, encoder_state, source_sequence_length,
                   target_sequence_length, max_target_sequence_length,
                   …
```
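The function body is truncated above; the sketch below shows one plausible shape for such a decoding layer using the TF 1.x `tf.contrib.seq2seq` attention API. The extra parameters (`rnn_size`, `target_vocab_size`, `batch_size`) and the choice of Bahdanau attention are assumptions, not the author's actual implementation.
```
# Hedged sketch: decoder cell wrapped with Bahdanau attention (TF 1.x).
import tensorflow as tf

def decoding_layer(dec_input, encoder_outputs, encoder_state,
                   source_sequence_length, target_sequence_length,
                   max_target_sequence_length, rnn_size, target_vocab_size,
                   batch_size):
    dec_cell = tf.contrib.rnn.LSTMCell(rnn_size)

    # Attention over the encoder outputs (memory), masked by source lengths.
    attention = tf.contrib.seq2seq.BahdanauAttention(
        num_units=rnn_size,
        memory=encoder_outputs,
        memory_sequence_length=source_sequence_length)
    attn_cell = tf.contrib.seq2seq.AttentionWrapper(
        dec_cell, attention, attention_layer_size=rnn_size)

    # Initialise the attention cell state from the encoder's final state.
    initial_state = attn_cell.zero_state(batch_size, tf.float32).clone(
        cell_state=encoder_state)

    output_layer = tf.layers.Dense(target_vocab_size)
    helper = tf.contrib.seq2seq.TrainingHelper(
        dec_input, target_sequence_length)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        attn_cell, helper, initial_state, output_layer)
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder, maximum_iterations=max_target_sequence_length)
    return outputs
```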
-
### Please ask your question

| Option | Value |
|:---:|:---:|
| mode…
-
Ref: https://github.com/tesseract-ocr/langdata/pull/76#issuecomment-320425422
copied below
> Hello
> I'm a software engineering student and I use the Tesseract OCR engine in a university project. For…
-
Hi Pooya,
Thanks for releasing your code.
Though the processed attention map is released, I am wondering if the teacher module is available? I'd like to apply it in my project and I'd appreciate it…
-
I would like to pass the output of a VGG16 feature extractor, shaped (-1, 512, 7, 7), into 2 sequential, identical ConvLSTM layers, say layers with hidden state (7 * 7 * 256). Could you let me know how to pass the…
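One common way to wire this up in Keras (a sketch under assumed input shapes and sequence length, not the asker's actual pipeline) is to run a frozen VGG16 over each frame with `TimeDistributed`, producing channels-last (7, 7, 512) feature maps, and stack the two ConvLSTM layers on top:
```
# Hedged Keras sketch: per-frame VGG16 features into two ConvLSTM2D layers.
from keras.applications import VGG16
from keras.layers import Input, TimeDistributed, ConvLSTM2D
from keras.models import Model

TIMESTEPS = 8  # assumed sequence length

frames = Input(shape=(TIMESTEPS, 224, 224, 3))
vgg = VGG16(include_top=False, weights='imagenet')  # outputs (7, 7, 512)
vgg.trainable = False

# Apply the frozen extractor to every frame: (batch, T, 7, 7, 512).
features = TimeDistributed(vgg)(frames)

# Two sequential, identically configured ConvLSTM layers with 256 channels.
x = ConvLSTM2D(256, (3, 3), padding='same', return_sequences=True)(features)
x = ConvLSTM2D(256, (3, 3), padding='same', return_sequences=False)(x)

model = Model(frames, x)
model.summary()
```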
-
Hi, slundberg,
**versions:**
keras 2.2.4
tensorflow 1.12.0
python2.7.18
CPU
shap 0.28.5
**Code piece:**
```
word_embedding_layer = Embedding(WORD_VOCAB_LEN,
                                 W…
```
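The code piece is truncated; for context, a Keras model with an `Embedding` layer is typically handed to SHAP via `DeepExplainer`, roughly as sketched below. The model definition and background/sample arrays are illustrative assumptions, not the reporter's actual code.
```
# Hedged sketch: explaining an Embedding+LSTM Keras model with SHAP.
import numpy as np
import shap
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

WORD_VOCAB_LEN, EMBED_DIM, SEQ_LEN = 5000, 100, 40  # assumed sizes

model = Sequential([
    Embedding(WORD_VOCAB_LEN, EMBED_DIM, input_length=SEQ_LEN),
    LSTM(64),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

# Background distribution and samples to explain (token-id matrices).
background = np.random.randint(0, WORD_VOCAB_LEN, size=(50, SEQ_LEN))
samples = np.random.randint(0, WORD_VOCAB_LEN, size=(5, SEQ_LEN))

explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(samples)
```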
-
Hello, I am using the COCO dataset with a two-layer LSTM model: one layer for top-down attention and one layer for the language model.
Words are extracted with jieba.
I used all the words in the picture de…
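For reference, one decoding step of a two-layer decoder of this kind (an attention LSTM feeding a language LSTM, in the spirit of top-down attention captioners) can be sketched as below. This is a hypothetical PyTorch illustration; the layer sizes, attention form, and the class name `TopDownStep` are assumptions, not the poster's model.
```
# Hedged sketch of one step of a two-LSTM top-down attention decoder.
import torch
import torch.nn as nn

class TopDownStep(nn.Module):
    def __init__(self, embed_dim, feat_dim, hidden_dim, vocab_size):
        super().__init__()
        self.att_lstm = nn.LSTMCell(embed_dim + feat_dim + hidden_dim, hidden_dim)
        self.lang_lstm = nn.LSTMCell(feat_dim + hidden_dim, hidden_dim)
        self.att_proj = nn.Linear(feat_dim + hidden_dim, 1)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_emb, feats, state):
        # feats: (batch, regions, feat_dim); state: ((h1, c1), (h2, c2)).
        (h1, c1), (h2, c2) = state
        mean_feat = feats.mean(dim=1)

        # LSTM 1: top-down attention LSTM, conditioned on the language
        # LSTM's previous hidden state, the mean image feature and the word.
        h1, c1 = self.att_lstm(torch.cat([word_emb, mean_feat, h2], dim=1), (h1, c1))

        # Attention weights over regions, conditioned on the attention LSTM state.
        h1_exp = h1.unsqueeze(1).expand(-1, feats.size(1), -1)
        scores = self.att_proj(torch.cat([feats, h1_exp], dim=2)).squeeze(2)
        alpha = torch.softmax(scores, dim=1)
        context = (alpha.unsqueeze(2) * feats).sum(dim=1)

        # LSTM 2: language model LSTM produces the word distribution.
        h2, c2 = self.lang_lstm(torch.cat([context, h1], dim=1), (h2, c2))
        logits = self.out(h2)
        return logits, ((h1, c1), (h2, c2))
```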