tomohikoabe-gvatech closed this issue 4 years ago
You are correct: strictly speaking, we should compute the distance on the original representation (text). However, to compute the distance between two strings, you have to represent them somehow. To keep things simple, we just used the same binary representation that is used for the explanation. We could have used count vectors, which would be a little more meaningful, I guess.
If you want to implement a different distance function, you can use `inverse_data`, which has the perturbed data in string form, with the original string in position [0].
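As a rough illustration of the count-vector idea mentioned above, here is a minimal sketch that computes cosine distance on word counts between the original string and each perturbed string. It assumes you already have a list of strings laid out like `inverse_data` (original at position 0). The function name `count_vector_distances`, the sample list `perturbed`, and the whitespace tokenization are all hypothetical choices made for this example, not part of LIME.

```python
import math
from collections import Counter


def count_vector_distances(texts):
    """Cosine distance between texts[0] and every entry of texts,
    computed on word-count vectors (whitespace tokenization, an assumption)."""
    counts = [Counter(t.split()) for t in texts]
    original = counts[0]
    orig_norm = math.sqrt(sum(v * v for v in original.values()))
    distances = []
    for c in counts:
        norm = math.sqrt(sum(v * v for v in c.values()))
        if norm == 0 or orig_norm == 0:
            # Convention chosen for this sketch: two empty texts are identical,
            # an empty text is maximally distant from a non-empty one.
            distances.append(0.0 if norm == orig_norm else 1.0)
            continue
        dot = sum(v * c[w] for w, v in original.items())
        distances.append(1.0 - dot / (orig_norm * norm))
    return distances


# Hypothetical stand-in for inverse_data: original string first, then perturbations.
perturbed = [
    "the movie was really really good",
    "the movie was good",
    "really really",
]
distances = count_vector_distances(perturbed)
```

Unlike the binary representation, the count vectors keep track of repeated words, so removing a word that occurs twice moves the sample further from the original than removing a word that occurs once.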
I'm currently working on text classification tasks with LSTM and would like to use LIME to help to explain the results.
I have two questions.
(1) I think there is a difference between the paper and the code in how the distance function `D(x,z)` (Eq. 2 in the paper) is calculated. Specifically, in the paper, the distance function is calculated over the original representations `x` and `z` in `R^d`, whereas in the code (`lime/lime/lime_text.py`) it is calculated over the interpretable (binary) representations `x'` and `z'` in `{0,1}^d'` as follows:

(2) If the calculation method in the paper is correct, I couldn't figure out how to calculate the distance function `D(x,z)` when the input is a sequence of token vectors `{x1, ..., xn}`.

I would really appreciate it if you could respond to these questions.