Closed — jej127 closed this issue 1 year ago
Hi Jeon, you can remove [:,1] from "predict_prob = t.predict_proba(X_train)[:,1]" to get the probabilities for all classes.
"notebooks/counterfactual_inference_example_huggingface.ipynb" seems to be the two-class version of the bias model (entail vs non-entail); let me update it with the three-class version.
You can download the file from here.
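For reference, here is a minimal, self-contained sketch of that change using toy data (an assumption for illustration; in the notebook, X_train comes from the bias features and t is the fitted LogisticRegression from the snippet above):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the notebook's features and labels (assumption:
# the real X_train/y_train are built from the NLI bias features).
rng = np.random.default_rng(0)
X_train = rng.random((100, 4))
y_train = rng.integers(0, 2, size=100)  # binary entail vs non-entail labels

t = LogisticRegression().fit(X_train, y_train)

# With [:,1]: only the probability of the positive class, shape (N,).
pos_prob = t.predict_proba(X_train)[:, 1]

# Without [:,1]: the full probability matrix, shape (N, n_classes),
# where each row sums to 1.
predict_prob = t.predict_proba(X_train)
print(pos_prob.shape, predict_prob.shape)  # (100,) (100, 2)
```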
Thanks for the clear answer. I removed [:,1] from "predict_prob = t.predict_proba(X_train)[:,1]" and observed a result like array([[0.1, 0.9], [0.2, 0.8], ..., [0.6, 0.4], [0.7, 0.3]]), which is an (N x 2) array whose rows each sum to 1.
On the other hand, each prediction in the file "dev_prob_korn_lr_overlapping_sample_weight_3class.jsonl" consists of three probabilities, such as "bias_probs":[0.17,0.37,0.46]. So I would be grateful to know how to produce these triple predictions using LogisticRegression, which I understood to be a method for binary response variables.
Thank you. Best regards, Jeon
There are two classes when you frame the NLI problem as entail vs non-entail. We will upload the three-class version soon. The main difference is the preprocessing step, in which we combine the neutral and contradiction classes into one non-entail class. If you remove this step, you will have an (N x 3) array.
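To make the difference concrete, here is a hedged sketch with toy data (the label encoding below is an assumption for illustration; the actual feature construction lives in the notebooks):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((300, 4))               # toy stand-in for the bias features
labels = rng.integers(0, 3, size=300)  # 0=entailment, 1=neutral, 2=contradiction

# Two-class framing: merge neutral and contradiction into one non-entail class.
y_binary = np.where(labels == 0, 0, 1)
probs_2 = LogisticRegression().fit(X, y_binary).predict_proba(X)
print(probs_2.shape)                   # (300, 2)

# Skip the merging step and keep all three labels: scikit-learn's
# LogisticRegression handles the multiclass case directly, so
# predict_proba returns one probability per class, with rows summing to 1.
probs_3 = LogisticRegression().fit(X, labels).predict_proba(X)
print(probs_3.shape)                   # (300, 3)
```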
Thanks for the explanation. It helps a lot! I'll close the issue.
We have uploaded the script: https://github.com/c4n/debias_nlu/blob/main/notebooks/Bias_Model_3_classes.ipynb
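Not the uploaded notebook itself, but for readers who only need the output layout, here is a minimal sketch of dumping per-example three-class probabilities in the "bias_probs" jsonl format quoted earlier (the file name below is illustrative, and the real file may contain additional fields):

```python
import json
import numpy as np

# Assume probs_3 is an (N, 3) array from predict_proba, as in the sketch above.
probs_3 = np.array([[0.17, 0.37, 0.46],
                    [0.70, 0.20, 0.10]])

# One JSON object per line; only the "bias_probs" key from the example
# quoted earlier in this thread is reproduced here.
with open("dev_prob_3class_example.jsonl", "w") as f:
    for row in probs_3:
        f.write(json.dumps({"bias_probs": [float(p) for p in row]}) + "\n")
```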
Hi @c4n, I wonder how you obtained the file "dev_prob_korn_lr_overlapping_sample_weight_3class.jsonl" used in the code "notebooks/counterfactual_inference_example_huggingface.ipynb".
I ran the code "notebooks/Bias_Model_use_our_features.ipynb", but LogisticRegression produces a single probability for each instance, while your counterfactual inference methodology seems to require three probabilities that sum to 1 (e.g. [0.7,0.2,0.1]).
Thank you. Best regards, Jeon.