Closed danhoonlee closed 1 year ago
Hello @danhoonlee
From what you describe, TimeSHAP can work in your use case. If you want explanations for each of the checkups, you can use TimeSHAP to explain each checkup individually, given the events that precede it.
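To make "each checkup given its previous events" concrete, here is a minimal sketch of how the sequence could be split before explaining it. It uses only NumPy and does not call TimeSHAP itself. The helper `checkup_prefixes`, the `is_checkup` flags, and the placeholder feature values are assumptions for illustration.

```python
import numpy as np

def checkup_prefixes(events, is_checkup):
    """Return one sub-sequence per checkup: every event up to and including it."""
    events = np.asarray(events)
    checkup_idx = np.flatnonzero(np.asarray(is_checkup, dtype=bool))
    return [events[: i + 1] for i in checkup_idx]

# The example sequence from this thread: 10 injections and 4 checkups.
# A 1 marks a checkup event, a 0 marks an injection event.
is_checkup = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1]

# Placeholder features (one value per event); real data would have more columns.
events = np.arange(len(is_checkup)).reshape(-1, 1)

prefixes = checkup_prefixes(events, is_checkup)
print([len(p) for p in prefixes])  # [4, 7, 9, 14]
```

Each prefix can then be treated as its own sequence to explain, with the model's output at the last step of the prefix as the score of interest.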
Regarding the error message, changing the baseline sequence is not specific to classification, although it is more straightforward in that domain. You can try different baseline sequences that suit your domain to obtain different results.
Regarding the option of "skipping the pruning algorithm", I have double-checked the code, and it will not resolve this error message (my previous answer in the commit is outdated). The value 0.1 is hard-coded and is tailored to classification problems. If this problem persists, feel free to tell me and I can make this value user-defined to get around cases like this.
Note that currently, 0.1 is the minimum allowed difference between the score of the explained sequence and the score of the baseline. This difference is the budget that TimeSHAP has to distribute along the explained axis (event/feature/cell). If it is too small, the explanations will also be very small.
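As a rough way to compare candidate baselines before running the explainer, the sketch below computes that score difference by hand. It is not TimeSHAP code: `toy_model` is a hypothetical stand-in for a regression model, `score_gap` is an illustrative helper, and building the baseline by repeating a single event at every time step is an assumption. Only the 0.1 threshold comes from this thread.

```python
import numpy as np

MIN_SCORE_GAP = 0.1  # the hard-coded threshold discussed above

def toy_model(seq):
    """Hypothetical regression model: a decayed sum where recent events weigh more."""
    weights = 0.5 ** np.arange(len(seq))[::-1]
    return float(np.dot(weights, seq[:, 0]))

def score_gap(model, instance, baseline_event):
    """Absolute score difference between the instance and a baseline sequence
    built by repeating `baseline_event` at every time step (an assumption)."""
    baseline = np.tile(baseline_event, (len(instance), 1))
    return abs(model(instance) - model(baseline))

instance = np.array([[0.20], [0.25], [0.30]])
candidates = {
    "mean event": instance.mean(axis=0),
    "zero event": np.zeros(1),
}

for name, event in candidates.items():
    gap = score_gap(toy_model, instance, event)
    status = "ok" if gap >= MIN_SCORE_GAP else "too low"
    print(f"{name}: gap={gap:.4f} ({status})")
```

In this toy setup the mean-event baseline scores almost the same as the instance, so the gap falls under 0.1, while the zero-event baseline leaves a larger budget. For a regression target the scale of the output matters, so rescaling the target or picking a more distant but still meaningful baseline are both worth trying.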
If you have any further questions, don't hesitate to contact us.
Closed this issue due to inactivity. If you have any further questions feel free to re-open the issue or create a new one.
Hello again @JoaoPBSousa! Thanks for your help last time.
With your help, I have been working on applying TimeSHAP to my model. However, I ran into another problem, and I think I need to ask you about it before continuing.
My model is a GRU-based regression model. It predicts the medicine concentration in a patient's body across multiple injections and multiple checkups. For example, patient A might go through 4 concentration checkups, and before each checkup the patient receives medicine injections, where the number of injections varies each time. Below is an example of 4 checkups with a different number of injections prior to each checkup. (injection1 - injection2 - injection3 - checkup1 - injection4 - injection5 - checkup2 - injection6 - checkup3 - injection7 - injection8 - injection9 - injection10 - checkup4[end])
Then, my model would have predictions at each checkup, meaning that this sequence would have 4 regression results with the last one being the most important. So, if only one regression output may be used, I can certainly take the last result.
If I am not mistaken, both of your examples (bank account and AReM) are classification cases? I want to ask whether I can use TimeSHAP with my model under this condition.
One more question: while running the code, I received the message "Score difference between baseline and instance is too low < 0.1...Consider choosing another baseline." I saw another issue about this and found that I could either skip the pruning algorithm or change the baseline sequence. Is changing the baseline sequence only possible in the classification case?
I hope my explanations and questions are clear. Thank you so much for your kind help. I really appreciate it!