marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License
11.54k stars 1.8k forks

Preserve order of text for explain_instance #667

Open paulmwatson opened 2 years ago

paulmwatson commented 2 years ago

Given the sentence:

The cat is a bad cat.

The current explain_instance returns:

[('bad', 0.023544989987054128), ('The', -2.223453279269586e-06), ('cat', -2.0328098267135788e-06), ('a', -1.29583902574453e-06), ('cat', -1.2776487837649124e-06), ('is', -1.1776258015649435e-06)]

The result is sorted by absolute weight rather than by position, so it is impossible to tell which occurrence of "cat" each weight belongs to.

Ideally the method would return the weights in input order:

[('The', -2.223453279269586e-06), ('cat', -2.0328098267135788e-06), ('is', -1.1776258015649435e-06), ('a', -1.29583902574453e-06), ('bad', 0.023544989987054128), ('cat', -1.2776487837649124e-06)]

I patched a fork for our needs, but I think this might be a useful option for others.
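For readers who want the input-ordered output without patching the library, here is a minimal sketch of one possible workaround. It is not the patch mentioned above. It assumes you have `(feature_id, weight)` pairs, such as those returned per label by the explanation's `as_map()`, and that each feature id is the token's position in the input. I believe that holds only when the text explainer is created with `bow=False`, so verify it against your installed version. The helper name `weights_in_input_order` is hypothetical, and the `\W+` split is assumed to match the explainer's tokenization.

```python
import re


def weights_in_input_order(text, id_weight_pairs, split_expression=r"\W+"):
    """Return (token, weight) pairs sorted by token position.

    Hypothetical helper, not part of lime. Assumes each feature id is the
    index of the token in the list obtained by splitting `text` on
    `split_expression` and dropping empty strings.
    """
    tokens = [t for t in re.split(split_expression, text) if t]
    ordered = sorted(id_weight_pairs, key=lambda pair: pair[0])
    return [(tokens[idx], weight) for idx, weight in ordered]


# Weights from the example above, keyed by assumed token position.
pairs = [
    (4, 0.023544989987054128),
    (0, -2.223453279269586e-06),
    (1, -2.0328098267135788e-06),
    (3, -1.29583902574453e-06),
    (5, -1.2776487837649124e-06),
    (2, -1.1776258015649435e-06),
]
result = weights_in_input_order("The cat is a bad cat.", pairs)
```

Sorting on the position rather than on the word keeps the two occurrences of "cat" distinct, which the word-only list cannot do.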

MNIKIEMA commented 2 years ago

Hello Paul, I have the same problem with multilabel classification: