kathrinse / TabSurvey

Experiments on Tabular Data Models

information in attributionsNone.json #12

Closed parsifal9 closed 1 year ago

parsifal9 commented 1 year ago

Hi Kathrin,

What is the format of the information in attributionsNone.json? I ran (PyTorch):

python attributions.py --model_name TabNet

and then in R

library(rjson)
myData <- fromJSON(file="output/TabNet/Adult/attributionsNone.json")
myData[[1]]
# [1] "TabNet" "TabNet"
myData[[2]]
# [1] "None" "None"
myData[[3]]
#[1] "Adult" "Adult"
> all.equal(myData[[4]][[1]],myData[[4]][[2]])
#[1] TRUE
> myData[[4]][[2]][1]
#[[1]]
 #[1] 1.232716441 0.000000000 0.000000000 0.016803199 0.018363461 0.005803025
 #[7] 0.438206583 0.303667098 0.000000000 0.000000000 0.619914174 0.000000000
#[13] 1.449993968 0.000000000

Bye R

unnir commented 1 year ago

@tleemann

parsifal9 commented 1 year ago

Thanks for that link. I have read the associated paper with great interest. However, it does not answer the somewhat simpler question that I am asking.

I have made some progress. For a command like

python attributions.py --globalbenchmark --numruns 1 --model_name TabNet

a file "global_benchmarkNone.json" is created and I get (using R)

myData <- fromJSON(file="output/TabNet/Adult/global_benchmarkNone.json")
myData[[1]]
#[1] "TabNet" "TabNet" "TabNet" "TabNet" "TabNet"
myData[[2]]
#[1] "MoRF" "MoRF" "MoRF" "MoRF" "LeRF"
plot(myData[[3]][[1]])   # this looks like the MoRF curve
lines(myData[[3]][[2]])  # this looks like the MoRF curve
lines(myData[[3]][[3]])
lines(myData[[3]][[4]])
lines(myData[[3]][[5]], col="red")  # this looks like the LeRF curve
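
For comparison, the same file can also be inspected from Python. This is only a minimal sketch: it assumes the key names (model, order, accuracies) described later in this thread and uses matplotlib for the plot.

import json
import matplotlib.pyplot as plt

# Path produced by the run above; adjust to your own output directory.
with open("output/TabNet/Adult/global_benchmarkNone.json") as f:
    data = json.load(f)

# Assumed layout: data["order"] labels each run as "MoRF" or "LeRF",
# and data["accuracies"] holds one accuracy curve per run.
for label, curve in zip(data["order"], data["accuracies"]):
    plt.plot(curve, label=label)
plt.xlabel("removal step")
plt.ylabel("accuracy")
plt.legend()
plt.show()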

1) Why are there 5 sets of results -- 4 MoRF and 1 LeRF?
2) The numbers do not seem to match the values printed to the screen, which gives accuracies (0.852, 0.8566, ...) with 14, 13, ... features, i.e. higher than the values in global_benchmarkNone.json.
3) I still haven't figured out attributions.py. Presumably it returns the attributions, but where are they?

Bye R

tleemann commented 1 year ago

Hi, when you run the attributions.py script, attributions for the Adult dataset (currently only this dataset is supported, but it should be easy to extend it to other data) are computed with the model, using the attention maps or the attribution function supplied by the model. strategy=None boils down to the default strategy for the corresponding model. A file output/<modeltype>/<dataset>/attributions<strategy>.json is then created with the keys model, strategy, dataset, attributions, where attributions is an (N, D) matrix of attributions (N = number of samples, D = number of features).

If you additionally pass the --globalbenchmark option, the Most Relevant First (MoRF) and Least Relevant First (LeRF) feature-removal tests are run. Their output is stored in another file, output/<modeltype>/<dataset>/global_benchmark<strategy>.json, with the keys model, order, accuracies, where accuracies contains the accuracies of the model as the features are successively removed. Normally, --numruns runs are executed for MoRF and LeRF.
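
To make the layout of the attributions file described above concrete, here is a minimal sketch of reading it in Python; the path and example values are taken from the run earlier in this thread, and the key names follow the description above.

import json
import numpy as np

# attributions<strategy>.json for the default strategy (None) on Adult.
with open("output/TabNet/Adult/attributionsNone.json") as f:
    data = json.load(f)

# Each key holds one entry per run of the script (see the note below).
print(data["model"])     # e.g. ["TabNet", "TabNet"]
print(data["strategy"])  # e.g. ["None", "None"]
print(data["dataset"])   # e.g. ["Adult", "Adult"]

# Attributions of the first run as an (N, D) matrix.
attr = np.array(data["attributions"][0])
print(attr.shape)  # (number of samples, number of features)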

Note: The output files are not overwritten when you restart the code. Instead, the results of the different runs of the script are concatenated into a list in the JSON file. Therefore, if you start the script and abort it (or it crashes) three times after the MoRF results were written, you may see only the MoRF results of those runs in the file. If it then runs to the end on the fourth attempt, both a MoRF and a LeRF run are appended to the file. Just make sure to delete the file before you start the script if you do not want this behavior. The implementation of the logging procedure can be found in attributions.py and utils/io_utils.py; please take a look there for full details.
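
The concatenation behavior can be pictured with a small sketch. This is only an illustration of the idea, not the actual code in attributions.py or utils/io_utils.py, and the function name is made up.

import json
import os

def append_run(path, run_result):
    # Start from the existing file if there is one, otherwise from empty lists.
    if os.path.exists(path):
        with open(path) as f:
            results = json.load(f)
    else:
        results = {key: [] for key in run_result}
    # Append this run's value under every key, so each key holds one entry per run.
    for key, value in run_result.items():
        results[key].append(value)
    with open(path, "w") as f:
        json.dump(results, f)

Every invocation adds one more entry per key, which is why aborted runs can leave extra MoRF entries behind, as in the 4 MoRF / 1 LeRF example above.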

Hope this helps, Tobias