marcotcr / lime

Lime: Explaining the predictions of any machine learning classifier
BSD 2-Clause "Simplified" License

Facing issue for explain instance with custom classifier function #597

Closed: mayurka closed this issue 3 years ago

mayurka commented 3 years ago

    exp = explainer.explain_instance(df_val_final.Description[idx], predproba_list, num_features=5, top_labels=2)

When I call explain_instance on a LimeTextExplainer, the statement above keeps running continuously and repeats the warning below. Execution stops only if I interrupt the kernel.

    C:\ProgramData\Anaconda3\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
      warn('Tensor is int32: upgrading to int64; for better performance use int64 input')
    C:\ProgramData\Anaconda3\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
      warn('Tensor is int32: upgrading to int64; for better performance use int64 input')
    C:\ProgramData\Anaconda3\lib\site-packages\fastai\torch_core.py:83: UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input
      warn('Tensor is int32: upgrading to int64; for better performance use int64 input')

I want to use my own custom classifier model, so I wrote a classifier function, predproba_list, which returns a numpy array of predicted probabilities for the classes. Below is the function code:

    def predproba_list(test1):
        pred = learn_clf.predict(test1)
        return np.array(pred[2])

The value of pred[2] is tensor([0.1423, 0.2133, 0.6444]), which I then convert to a numpy array.

Can you please advise whether the return value of the function is what explain_instance expects from its classifier function, and what could be causing the code to keep executing without producing any result?

Thanks in advance

mayurka commented 3 years ago

Now I am getting the error below:

    ValueError: Found input variables with inconsistent numbers of samples: [5000, 1]

5000 is the default value of the num_samples argument of explain_instance() when it is not set explicitly. How should the value of num_samples be chosen if I need to set it explicitly?

marcotcr commented 3 years ago

The output should be a 2D array, where the columns are the prediction probabilities for the different labels. If you only have one label, it should still have shape (n, 1).
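To make the shape requirement concrete, here is a minimal sketch of a wrapper that turns a single-text predictor into a batch classifier function. It assumes the classifier function is called with a list of texts and must return one row of probabilities per text, which would account for the [5000, 1] mismatch when only a single prediction comes back. The names fake_predict and make_batch_classifier are hypothetical; fake_predict only stands in for learn_clf.predict and mimics the (label, index, probabilities) tuple shown in the question.

```python
import numpy as np


def fake_predict(text):
    # Hypothetical stand-in for learn_clf.predict(text): returns a
    # (label, class index, probabilities) tuple like the one in the question.
    probs = np.array([0.1423, 0.2133, 0.6444])
    return "class_2", 2, probs


def make_batch_classifier(predict_one):
    # Wrap a single-text predictor so it accepts a list of texts and
    # returns an array of shape (n_texts, n_classes).
    def classifier_fn(texts):
        rows = [np.asarray(predict_one(t)[2], dtype=float) for t in texts]
        return np.vstack(rows)  # one row per input text

    return classifier_fn


predproba_list = make_batch_classifier(fake_predict)
probs = predproba_list(["first text", "second text", "third text"])
print(probs.shape)
```

With a real fastai learner, predict_one would be learn_clf.predict, and the probability tensor may need to be moved to the CPU before conversion; that part is an assumption and untested here. Predicting one text at a time is slow for the default 5000 samples, so passing a smaller num_samples to explain_instance while debugging may help.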