Are there lack of evaluation scripts in the Controlled Demographic experiments?

shauli-ravfogel / nullspace_projection

MIT License


Are there lack of evaluation scripts in the Controlled Demographic experiments? #3

Closed W-lw closed 4 years ago

W-lw commented 4 years ago

When I run the script ./run_deepmoji_debiasing.sh, the result is an .npy file that stores the projection matrix P rather than a TPR-GAP value. I looked through the whole project and couldn't find the code that evaluates TPR-GAP for the experiment in Section 6.2. Could you give me some hints or point me to the scripts for it?

yanaiela commented 4 years ago

Hey,

Sorry about that, the evaluation is in a notebook.

Please note that we had a bug in the evaluation: the test set was not the same across the different ratios, and the test ratios themselves differed. This is now fixed, so the test set is the same in every setting and its ratios are balanced. However, this does not change the trends we originally reported.

We will update the paper with the new numbers soon. The notebook referred to above is already updated with them.
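
For readers who only have the saved projection matrix and a classifier's predictions, here is a minimal sketch of how a TPR-GAP can be computed. It is not the notebook's code: the function name `tpr_gap`, the array names, the 0/1 encoding of the protected attribute, and the choice to summarize the per-class gaps with a root mean square are all assumptions made for illustration.

```python
import numpy as np

def tpr_gap(y_true, y_pred, z):
    """Per-class TPR gaps between two protected groups, plus their RMS.

    Hypothetical helper: y_true / y_pred are main-task labels and
    predictions, z is the protected attribute encoded as 0 or 1.
    """
    y_true, y_pred, z = map(np.asarray, (y_true, y_pred, z))
    gaps = {}
    for y in np.unique(y_true):
        tprs = []
        for group in (0, 1):
            # Examples whose gold label is y and that belong to this group.
            mask = (y_true == y) & (z == group)
            # TPR: fraction of those examples predicted as y.
            # (An empty mask yields nan; this sketch does not handle that.)
            tprs.append(np.mean(y_pred[mask] == y))
        gaps[int(y)] = float(tprs[0] - tprs[1])
    # Assumed summary statistic: root mean square of the per-class gaps.
    rms = float(np.sqrt(np.mean(np.square(list(gaps.values())))))
    return gaps, rms

# Toy data, invented purely to exercise the function.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 1])
z = np.array([0, 0, 1, 1, 0, 0, 1, 1])
gaps, rms = tpr_gap(y_true, y_pred, z)
print(gaps, rms)
```

In practice `y_pred` would come from a classifier run on the test representations, once before and once after applying P, so that the two gaps can be compared.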

W-lw commented 4 years ago

Thank you for letting me know! I will run the evaluation with your new experimental setup.