dfdazac / blp

"Inductive Entity Representations from Text via Link Prediction" @ The Web Conference 2021
MIT License

Update neg sampling docstring #3

Closed suamin closed 2 years ago

suamin commented 2 years ago

Hi, thanks for the example in the negative sampling docstring. This is just a minor change to it.

dfdazac commented 2 years ago

Hi, thanks for contributing! The current example is fine though:

[[0, 3],
 [5, 3],
 [4, 6],
 [1, 7]]

It means that in the first row, the second entity, which was originally 1, was randomly replaced by 3. This can also happen, since 3 belongs to a different row. Please let me know if this is not clear.
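The reading described above (a replacement may be any index from a different row, including one that was not itself selected for corruption) can be sketched in plain Python. This is only an illustrative sketch and not the repository's implementation: the helper name `corrupt_rows` and its arguments are hypothetical, and it draws a single negative per row using the standard `random` module.

```python
import random

def corrupt_rows(batch_size, rng=random):
    """Hypothetical helper: corrupt one position in each row of the index array."""
    num_ents = batch_size * 2
    # Index array [[0, 1], [2, 3], ...], one row per entity pair
    idx = [[2 * i, 2 * i + 1] for i in range(batch_size)]
    corrupted = []
    for row in idx:
        pos = rng.randrange(2)  # which column of this row gets corrupted
        # Candidates: every index in the array except the two in this row
        candidates = [e for e in range(num_ents) if e not in row]
        new_row = list(row)
        new_row[pos] = rng.choice(candidates)
        corrupted.append(new_row)
    return idx, corrupted

original, negatives = corrupt_rows(4, random.Random(0))
print(original)   # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(negatives)  # one entry per row differs, taken from another row
```

Under this assumption, a row such as `[0, 3]` is a valid outcome for the first pair, because 3 comes from the second row.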

suamin commented 2 years ago

Hi, thank you for the clarification.

I may have misinterpreted this bit: "they are then replaced with a random index from a row other than the row to which they belong". I read it as saying that the new indices must be selected from the corrupted indices, i.e. [1, 2, 5, 6]. In the example, the updated matrix used [1, 5, 6] but not 2, so I thought it was a mistake. Sorry for the misunderstanding.

In case it is helpful, here is the source of my confusion, step by step:

Original:

        [[0, 1],
         [2, 3],
         [4, 5],
         [6, 7]]

Selected positions:

        [[0, -],
         [-, 3],
         [4, -],
         [-, 7]]

selected = [1, 2, 5, 6]

Now we replace each selected position with a value from selected, under the constraint that the new value must not come from the same row. Possible variations could be:

1.
        [[0, 6],
         [5, 3],
         [4, 2],
         [1, 7]]
2.
        [[0, 6],
         [5, 3],
         [4, 1],
         [2, 7]]
3.
        [[0, 2],
         [1, 3],
         [4, 6],
         [5, 7]]
etc.

The documentation led me to think that we should sample only from selected.
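To make the difference between the two readings concrete, here is a small sketch that lists, for each row, the candidate replacements under each interpretation. The helper `candidate_pools` and its arguments are hypothetical and are not part of the repository; the index array and the selected columns are the ones from the example in this thread.

```python
def candidate_pools(idx, selected_cols):
    """Hypothetical helper: per-row candidate replacements under both readings."""
    all_ents = [e for row in idx for e in row]
    # Entries removed from each row (the dashes in the example)
    selected = [row[c] for row, c in zip(idx, selected_cols)]
    pools = []
    for row in idx:
        # Reading 1: any index that belongs to a different row
        any_other_row = [e for e in all_ents if e not in row]
        # Reading 2: only selected indices that belong to a different row
        selected_only = [e for e in selected if e not in row]
        pools.append((any_other_row, selected_only))
    return selected, pools

idx = [[0, 1], [2, 3], [4, 5], [6, 7]]
selected, pools = candidate_pools(idx, [1, 0, 1, 0])
print(selected)     # [1, 2, 5, 6]
print(pools[0][0])  # [2, 3, 4, 5, 6, 7]
print(pools[0][1])  # [2, 5, 6]
```

For the first row, 3 appears only in the first pool, which is why `[0, 3]` is valid under the first reading but looked like a mistake under the second.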