amanchadha / coursera-deep-learning-specialization

Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv) Convolutional Neural Networks; (v) Sequence Models

C5_W4_A1_Transformer_EX3_scaled_attention_logits #45

Open · mrgransky opened this issue 1 year ago

mrgransky commented 1 year ago

Your scaled_attention_logits is calculated incorrectly; with the current code, the unit test fails:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-41-00665b20febb> in <module>
      1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
     73     assert np.allclose(weights, [[0.30719590187072754, 0.5064803957939148, 0.0, 0.18632373213768005],
     74                                  [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862],
---> 75                                  [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862]]), "Wrong masked weights"
     76     assert np.allclose(attention, [[0.6928040981292725, 0.18632373213768005],
     77                                    [0.6163482666015625, 0.2326965481042862],

AssertionError: Wrong masked weights

The masking line should instead be:

if mask is not None:  # Don't replace this None
    scaled_attention_logits += (1 - mask) * -1e9
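
For reference, here is a minimal sketch of the full function with that fix in place, following the standard TensorFlow formulation this assignment is based on. Treat it as an illustration of the masking convention rather than the official solution; the variable names and shapes are the usual ones from the notebook.

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k: (..., seq_len, depth); v: (..., seq_len, depth_v).
    # mask: float tensor broadcastable to (..., seq_len_q, seq_len_k),
    # with 1 at positions to attend to and 0 at positions to block.
    matmul_qk = tf.matmul(q, k, transpose_b=True)  # (..., seq_len_q, seq_len_k)

    # Scale by sqrt(d_k) so the logits stay in a range where softmax is not saturated
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Where mask == 0, (1 - mask) == 1, so the logit is pushed to about -1e9
    # and softmax assigns that position a weight of ~0
    if mask is not None:
        scaled_attention_logits += (1 - mask) * -1e9

    # Softmax over the key axis, so each query's weights sum to 1
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)  # (..., seq_len_q, depth_v)
    return output, attention_weights

Note the convention: the course's tests pass a mask with 1 for tokens to keep, whereas the original TensorFlow transformer tutorial (which this notebook adapts) uses 1 for tokens to drop and therefore adds mask * -1e9. That inversion is exactly why (1 - mask) is needed here, and it is consistent with the 0.0 entries in the expected weights above.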

Cheers,